Conference Co-Chairs’ Message 


Welcome to Japan and WWW2005. This is the 14th conference in the series that was started by 
Robert Cailliau in Geneva in 1994. It is also worth noting that the conference has now been held 
in each of the three major W3C host countries (USA, France, and Japan). 


This year’s conference maintains the traditional 1:3:1 format including: a day of tutorial and 
workshop sessions, three days of keynotes, paper tracks and poster presentations as well as the 
special W3C track, and finishes with Developers Day. 


Once again we are delighted to have Tim Berners-Lee deliver the opening keynote, followed
by four internationally distinguished speakers in the other plenary session slots. This year
Developers Day also begins with a special keynote address. 


The resources and staff of Keio University, the W3C Japan Host and the local Professional Confer- 
ence Organizer have worked with a large number of volunteers and members of the International 
World Wide Web Conference Committee (IW3C2), the group that administers the conference series,
to organise this year’s conference. This work has been supported by our much valued conference 
partners and sponsors. 


Conferences of this nature happen for three reasons: a spirit of inquiry about the subject; the 
willingness of a group of authors to write up and share their ideas and research findings; and a group
of organisers (mostly volunteers) who are prepared to put in the time and effort to build and
manage a conference program and run a conference environment. Conferences are team efforts
— our thanks to everyone who has been a part of the WWW2005 team. 


Finally, we hope you find the conference stimulating and professionally rewarding and that you 
enjoy your stay in Japan and have a safe trip home. 


Allan Ellis, Southern Cross University
Tatsuya Hagino, Keio University

Mark Baker, Coactus, Canada (Co-Chair)


Keynote Speakers


Tim Berners-Lee 
Director of World Wide Web Consortium 


Tim Berners-Lee is a graduate of Oxford University, England, and 
currently holds the 3Com Founders chair at the Computer Science 
and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts 
Institute of Technology (MIT). He directs the World Wide Web Con- 
sortium (W3C), an open forum of companies and organizations with 
the mission to lead the Web to its full potential through the develop- 
ment of Web technical standards, which he founded in October 1994. 


With a background of system design in real-time communications and
text processing software development, Tim invented the World Wide
Web, an internet-based hypermedia initiative for global information
sharing, while working at CERN, the European Particle Physics
Laboratory. He wrote the first version of HTML, as well as the first web client (browser-editor) and
server, in 1990.


Subsequent honors include a MacArthur Fellowship, the ACM Software Systems Award, the IEEE Koji
Kobayashi Computers and Communications Award, the Albert Medal of the Royal Society for the
encouragement of Arts, Manufactures and Commerce, the Japan Prize and the Finnish Millennium
Technology Prize.


He is a Distinguished Fellow of the British Computer Society, an Honorary Fellow of the
Institution of Electrical Engineers, a member of the American Academy of Arts and Sciences, and
a Fellow of the Royal Society. In 2004, Tim was made a Knight Commander of the Order of the 
British Empire (KBE). 


Eric Brewer 

Associate Professor, Computer Science Division 
Alfred P. Sloan Research Fellow 

University of California at Berkeley 


Dr. Brewer focuses on all aspects of Internet-based systems, includ- 
ing technology, strategy, and government. As a researcher, he has led 
projects on scalable servers, search engines, network infrastructure, 
sensor networks, and security. His current focus is (high) technology
for developing regions, with projects in India and Bangladesh (so far), 
and including communications, health, education, and e-government. 


In 1996, he co-founded Inktomi Corporation with a Berkeley grad 
student based on their research prototype, and helped lead it onto the 
Nasdaq 100 before it was bought by Yahoo! in March 2003. 


In 2000, he founded the Federal Search Foundation, a 501(c)(3) organization focused on improving
consumer access to government information. Working with President Clinton, Dr. Brewer helped to 
create FirstGov.gov, the official portal of the Federal government, which launched in September 
2000. 


He received an MS and Ph.D. in EECS from the Massachusetts Institute of Technology, and a 
BS in EECS from UC Berkeley. He was named a “Global Leader for Tomorrow” by the World 
Economic Forum, by the Industry Standard as the “most influential person on the architecture of 
the Internet”, by InfoWorld as one of their top ten innovators, by Technology Review as one of the 
top 100 most influential people for the 21st century (the “TR100”), and by Forbes as one of their 
12 “e-mavericks”, for which he appeared on the cover.


Lorrie Cranor 
Associate Research Professor 
Carnegie Mellon University 


Dr. Lorrie Faith Cranor is an Associate Research Professor in the 
School of Computer Science at Carnegie Mellon University in Pitts- 
burgh, Pennsylvania. She is a faculty member in the Institute for 
Software Research, International and in the Engineering and Public 
Policy department. She is director of the CMU Usable Privacy and 
Security Laboratory (CUPS). She came to CMU in December 2003 
after seven years at AT&T Labs-Research. While at AT&T she also 
taught in the Stern School of Business at New York University. 


Dr. Cranor’s research has focused on a variety of areas where technology and policy issues interact, 
including online privacy, electronic voting, and spam. She is chair of the Platform for Privacy 
Preferences Project (P3P) Specification Working Group at the World Wide Web Consortium and 
author of the book Web Privacy with P3P (O’Reilly 2002). In 2003 she was named one of the top 
100 innovators 35 or younger by Technology Review magazine. 


Dr. Cranor received her doctorate degree in Engineering & Policy from Washington University in St. 
Louis in 1996. While in graduate school she helped found Crossroads, the ACM Student Magazine, 
and served as the publication’s editor-in-chief for two years. 


Dr. Cranor was chair of the Tenth Conference on Computers, Freedom and Privacy (CFP2000) and
program committee chair for the 29th Research Conference on Communication, Information and 
Internet Policy (TPRC 2001). In the Spring of 2000 she served on the Federal Trade Commission 
Advisory Committee on Online Access and Security. She also serves on the editorial boards of 
the journals ACM Transactions on Internet Technology, The Information Society, and Journal of 
Privacy Technology. 


Dr. Cranor has been studying electronic voting systems since 1994 and in 2000 served on the 
executive committee of a National Science Foundation sponsored Internet voting taskforce. 


Dr. Cranor was also a member of the project team that developed the Publius censorship-resistant 
publishing system. In February 2001, the Publius team was honored by Index on Censorship 
magazine for the “Best Circumvention of Censorship.” 


Dr. Cranor spends most of her free time with her husband, Chuck, her son, Shane, and her daughter,
Maya, but sometimes she finds time to play the tenor saxophone or design and create award-winning 
quilts. 


Rob Glaser 
Chairman and CEO 
RealNetworks, Inc. 


Rob Glaser, founder and CEO of RealNetworks, Inc. (NASDAQ:
RNWK), the recognized leader in Internet media delivery, has long
been intrigued with the nexus of media, computing, communication 
and the Internet. Since founding RealNetworks in 1995, Glaser has 
played an integral role in the transformation of the Internet into the 
next great mass medium. In 1995 under Glaser’s direction, RealNet- 
works introduced the groundbreaking RealAudio, RealVideo, RealPlayer and RealSystem technolo- 
gies, effectively transitioning television and radio from broadcast to the Web. With the launch of 
RealJukebox in 1999, RealNetworks secured its leadership position in the digital distribution of 
music. 


In 2001, RealNetworks introduced the revolutionary RealOne, an all-in-one service and technology 
platform, the single source for consumers to discover, play and manage the best in brand-name 
digital programming - music, entertainment, sports, news, and more. Since its launch, RealOne 
has become the fastest growing Internet paid media subscription service in history, with more than
half a million subscribers in less than eighteen months.


Prior to founding RealNetworks, Mr. Glaser worked for Microsoft from 1983 to 1993 in a number 
of executive positions, including Vice President of Multimedia and Consumer Systems. 


Mr. Glaser has served on several non-profit boards and committees, including his appointment by 
President Clinton to the Advisory Committee on Public Interest Obligations of Digital Television 
Broadcasters. 


Mr. Glaser is a graduate of Yale University, with a BA and an MA in Economics and a BS in 
Computer Science. 


Yuji Inoue 
Senior Vice President 
NTT 


Yuji Inoue was born in Fukuoka, Japan, in 1948. He received the 
B.E., M.E. and Ph.D. degrees from Kyushu University, Fukuoka,
Japan, in 1971, 1973 and 1986, respectively. He was made an Honorary 
Professor of Mongolian Technical University in 1999. He joined NTT 
(Nippon Telegraph and Telephone Corporation) Laboratories in 1973. 
He was first engaged in the development of digital network equipment 
and systems, such as digital synchronization, digital switching and 
digital subscriber loop transmission, and later in the standardization 
of narrow and broadband ISDN (Integrated Services Digital Network), 
SDH (Synchronous Digital Hierarchy) and TNA (Transport Network 
Architecture) through the international standards organization, ITU- 
T. He was the Special Rapporteur of Study Group XVIII of the ITU-T, formerly CCITT, and he 
co-led SDH and TNA as the first worldwide unique standards in these fields. While conducting
multimedia experiments in Japan, he co-initiated the next generation software architecture called 
Telecommunication Information Networking Architecture, TINA, in the Consortium of which he 
served as the Chairperson of its Technical Forum for six years from its establishment, 1993 - 1998. 
In 1997, he joined the global business incubation activities of NTT as the Leader of the Global Info- 
communications Service Development Project. After launching advanced Internet-based networking 
services in NTT’s global business area, he moved back to the Laboratories in July 1998 as the 
Executive Manager of NTT Multimedia Networks Laboratories, where he conducted leading-edge 
studies related to Information Sharing services and platforms. 


He joined NTT Data Corporation as the Deputy Senior Executive Manager of Research and De- 
velopment (R&D) Headquarters, as part of NTT’s re-organization in July 1999. He was also a 
Corporate Senior Vice-President and served as the Chief IT Partner for the IT Business Naviga- 
tion Group, newly established in September 2000. In June 2001, he became the Senior Executive 
Manager of R&D Headquarters and of the Intellectual Property Office in R&D Headquarters. He
additionally served as the Executive Manager of the Planning Department in R&D Headquarters 
from April 2002. 


Dr. Yuji Inoue moved back to NTT in June 2002, and is now serving as a Senior Vice-President of 
NTT and the Executive Director of Department III (R&D Strategy Department). 


He is an IEICE Fellow and also an IEEE Fellow. He has co-authored several books including 
“ISDN”, “Broadband ISDN and ATM Technologies”, “Network Architecture”, “The TINA Book”, 
“NTT’s Strategy for Global Information Sharing” and “Waves leading to Future Networks.” 
ID | TITLE | PRESENTERS | LOCATION
WF01 | 2nd International Cross-Disciplinary Workshop on Web Accessibility | Simon Harper, Yeliz Yesilada, and Carole Goble | 201A
WF04 | The Semantic Computing Initiative - From Semantic Web to Semantic World | Mitsuru Ishizuka and Koiti Hasida | 201B
WF05 | Interoperability of Web-based Educational Systems | Daniel Olmedilla, Nobuo Saito, and Bernd Simon | 302
WF06 | AIRWeb ’05 - Adversarial Information Retrieval on the Web | Brian Davison | 301B
WF07 | Innovations in Web Infrastructure (IWI) | Simon Courtenage, Boris Galitsky, and David Lewis | 204
WF08 | Web Service Semantics: Towards Dynamic Business Integration | Christoph Bussler, Richard Goodwin, Rubén Lara, David Martin, and Takahira Yamaguchi | 303
WF09 | Policy Management for the World Wide Web | Tim Finin, Jim Hendler, and Lalana Kagal | 205
WF10 | 2nd Annual Workshop on the Weblogging Ecosystem - Aggregation, Analysis and Dynamics | Natalie Glance, Matthew Hurst, and Eytan Adar | 304
WF11 | Activities on Semantic Web Technologies in Japan | Noboru Shimizu and Hideaki Takeda | Rindo (East)
WF12 | Customer Focused Mobile Services | Johan Hjelm, Annakaisa Hayrynen, Bin Wei, and Rittwik Jana | Rindo (West)


WF01 | 2nd International Cross-Disciplinary Workshop on Web Accessibility
Simon Harper, Yeliz Yesilada, and Carole Goble, University of Manchester 
Location: 201A 


Conventional workshops on accessibility tend to be single-disciplinary in nature. However, we are
concerned that this focus on a single participant group prevents the cross-pollination of ideas, needs, 
and technologies from other related but separate fields. This workshop will be decidedly cross- 
disciplinary and will bring together users, accessibility experts, graphic designers, and technologists 
from academia and industry to discuss how accessibility can be supported. We also encourage the 
participation of users and other interested parties as an additional balance to the discussion. Our 
aim is to focus on accessibility by encouraging participation from many disciplines. Views will 
bridge academia, commerce, and industry and we hope that arguments encompassing a range of 
beliefs across the design-accessibility spectrum will be presented. 


Last year’s workshop outcomes suggested a number of possible themes for the 2005 edition. The 
theme for this second workshop, ‘Engineering Accessible Design’, was the most requested topic 
for further discussions by our 2004 participants. Previous engineering approaches seem to have 
precluded the engineering of accessible systems. This is plainly unsatisfactory. Designers, authors,
and technologists are at present playing ‘catch-up’ with a continually moving target in an attempt
to retrofit systems. In fact, engineering accessible interfaces is as important as engineering their functionality
and should be an indivisible part of development. We should be engineering accessibility as
part of development, not as an afterthought or because government restrictions and civil law
require us to. Our workshop will bring together a cross section of the web design and engineering
communities to report on developments, discuss the issues, and suggest cross-pollinated solutions.


WF04 | The Semantic Computing Initiative - From Semantic Web to Semantic World
Koiti Hasida, Information Technology Research Institute, AIST
Mitsuru Ishizuka, University of Tokyo
Location: 201B


Semantic Computing is a vision of information technology based on semantics shared between peo- 
ple and machines, aiming at making computers more usable and useful to everybody. All the 
information content including not just Web pages but also software, document, and multimodal 
content should have explicit semantic structure, which would make it straightforward both to tell 
computers what people mean and to provide information services meaningful to people. For in- 
stance, incorporation of semantic structure from the authoring stage will both reduce the cost of 
authoring and improve the quality of the content (clarity of document, validity of program, and so 
on). 


Semantic Computing extends Semantic Web (in the narrow sense of ontology-based augmentation 
of Web pages) in terms of both breadth (Semantic Computing encompasses not just the Web but 
the entire IT) and depth (it addresses not only skeletal meaning of Web pages but detailed semantic 
structure of natural language, multimodal data, programming language, etc.), hence semantically 
enriching a much larger realm of the human life-world. Technologies including software engineering, 
user interface, natural-language processing, artificial intelligence, grid computing, and ubiquitous 
computing, among others, need to be integrated to embody this initiative. The workshop hence invites
interested experts to share their new ideas on topics including, but not limited to: 


• Integration of ontology-based description and semantic annotation;
• Middleware platform for Semantic Computing;
• Applications and business models based on Semantic Computing.


WF05 | Interoperability of Web-based Educational Systems
Daniel Olmedilla, L3S Research Center
Nobuo Saito, Keio University
Bernd Simon, Vienna University of Economics and Business Administration
Location: 302


Nowadays learning resources are increasingly available via web-based educational systems, such as 
learning (content) management systems, electronic market places for learning materials and courses, 
or knowledge repositories. With the dawn of various specialised e-learning tools, learning resources
have increasingly been stored in closed environments, restricting accessibility to a closed user
community. While standardization bodies and consortia such as ADL, CEN/ISSS, IEEE, IMS, and
ISO have already identified the need for interoperability of web-based educational systems, learners’
choices to fill a particular knowledge gap are in many cases still limited to the offers of the system
they are registered with.


Recently, researchers have started to focus on these issues in more depth. Web technologies have
emerged as a promising approach, with XML, RDF, Web query languages, and ontology-based
data integration becoming essential ingredients of this infrastructure.


WF06 | AIRWeb ’05 - Adversarial Information Retrieval on the Web
Brian D. Davison, Lehigh University 
Location: 301B 


Search is the single most common application used on the Web. The attraction of hundreds of
millions of searches per day provides significant incentive to content providers to do whatever is
necessary to rank highly in search engine results. The use of techniques that push rankings higher
than they belong is often called spamming a search engine. Such methods typically include textual
as well as link-based techniques. Like e-mail spam, this is a form of adversarial information
retrieval; the conflicting goals of accurate results for search providers and high positioning for content
providers provide an interesting, real-world environment in which to study techniques in optimization,
obfuscation, and reverse engineering, in addition to the application of information retrieval and
classification.


The workshop solicits technical papers on any aspect of adversarial information retrieval on the 
Web. Particular areas of interest include, but are not limited to, search engine spam, link-bombing,
reverse engineering of ranking algorithms, advertisement blocking, and web content filtering. Papers 
addressing higher-level concerns (e.g., whether ‘open’ algorithms can succeed in an adversarial 
environment, whether permanent solutions are possible, etc.) are also welcome. 


AIRWeb ’05 is intended to bring together researchers and practitioners who are concerned with
the on-going efforts in adversarial information retrieval on the Web. Workshop participants will 
hear peer-reviewed technical papers, but are also expected to contribute by helping to identify 
datasets and evaluation methodologies, and to provide feedback on how research in these areas can 
contribute to practice. 


WF07 | Innovations in Web Infrastructure (IWI)
Simon Courtenage, University of Westminster
Boris Galitsky, University of London - Birkbeck
David Lewis, Trinity College Dublin
Location: 204


The World-Wide Web provides us with a distributed hyperlinked document repository, but un- 
derlying the infrastructure of the web is a communications infrastructure, which is responsible for 
implementing much of the structure of the document repository. For example, in the current web, 
when a user chooses to navigate from a web page, using a hyperlink, to another page, they set in
motion a request/response transaction between their web browser and a web server, acting in a
client/server relationship, which implements that navigation. Recently, there has been increasing
interest in innovative network topologies, such as peer-to-peer (structured and unstructured), which
decentralize network control, and in communications paradigms, such as content-based networking
and publish/subscribe, which decouple producers and consumers of information and provide
asynchronous as well as synchronous information delivery. Yet there is little focus on how this 
research can benefit the web. At the same time, from the perspective of the web, there has been 
tremendous interest in extending the infrastructure of the web, for example, through the use of 
ontologies to structure knowledge, and through the study of web topology and its influence on 
web search, virtual communities, collaborations and distributed information delivery. Yet there 
has been little focus on how advances in communications and networking can contribute to this 
research. Many open research problems exist in this area, such as semantic interoperability and 
the scalability of ontology-based reasoning within distributed knowledge environments, which re- 
quire contributions from the communications and networking community in order to advance robust 
solutions. 


IWI will tackle this problem by providing a forum within which web infrastructure topics can be 
discussed in relation to communications and networking, and similarly, advances in networking can 
be discussed in relation to their impact on the infrastructure of the web. A possible list of workshop 
topics would therefore include (but not be limited to): 


• Ontology-based routing by content;
• Meta-data management in P2P networks;
• Communications support for distributed reasoning;
• Web topologies and distributed agents;
• Content-based networking for distributed collaboration and virtual communities;
• Decentralized access control and trust.


WF08 | Web Service Semantics: Towards Dynamic Business Integration
Christoph Bussler, National University of Ireland, Galway
Richard Goodwin, IBM T. J. Watson Research Center
Rubén Lara, Digital Enterprise Research Institute
David Martin, SRI International
Takahira Yamaguchi, Keio University
Location: 303


The description of Web services in a machine-understandable fashion is expected to have a great 
impact in the areas of e-Commerce and Enterprise Application Integration, as it can enable dy- 
namic and scalable cooperation between independently developed systems and organisations. These 
potential benefits have led to the establishment of an important class of research activities, both 
in industry and academia, aimed at the practical deployment of declarative, semantically rich ser- 
vice and process descriptions and their use across the Web service lifecycle. This research, which 
draws on a variety of fields such as Knowledge Representation, Automated Software Engineering, 
Process Modeling, Workflow, and Software Agents, goes under the heading of Semantic Web Ser- 
vices (SWS). We note that here, “Semantic Web” does not denote any particular set of standards 
or commitment to any particular vision regarding the future of the Web. In addition, many SWS
efforts are aligned with rapidly developing commercial Web Service standards such as WSDL and 
UDDI. 


Many major challenges need to be addressed in this field. This workshop aims to provide a forum
in which to focus on selected core technical challenges for deployment of SWS, and reach a
better understanding of the relationships between commercial Web service standards, current SWS 
research efforts, and the ultimate requirements for full-scale deployment of these technologies. An- 
other major focus will be on the relationship of work on SWS to the needs of business systems, and 
in particular the needs having to do with publishing policies associated with Web services, such 
as those discussed at the recent W3C Workshop on Constraints and Capabilities for Web Services 
(see http://www.w3.org/2004/06/ws-cc-cfp.html). We will particularly seek submissions that
demonstrate innovative application of SWS technologies to the challenges involved in automating 
online business transactions. 


WF09 | Policy Management for the World Wide Web
Tim Finin, University of Maryland
Jim Hendler, University of Maryland
Lalana Kagal, University of Maryland
Location: 205


In order to realize the full potential of the World Wide Web as an open, dynamic, and distributed 
“universe of network-accessible information”, it is important for web entities to behave appropri- 
ately. Policy management provides the openness, flexibility, and autonomy required to regulate 
this environment as entities can reason over their own policies and the policies of other entities to 
decide how to behave. Using policies also allows entities to specify expected behavior of entities 
they interact with. Entities can also adapt to increasingly complex requirements without the need 
for substantial changes to the structure or implementation through the use of policies. 


Policy management includes policy specification, deployment, reasoning over policies, updating and 
maintaining policies, and enforcement. We propose that policy management is required for the web 
for (i) constraining different kinds of behavior including security, privacy, conversation, and col- 
laboration, (ii) configuration management, (iii) describing business processes, and (iv) establishing 
trust and reputation. 


WF10 | 2nd Annual Workshop on the Weblogging Ecosystem - Aggregation, Analysis and Dynamics
Natalie Glance, Intelliseek Applied Research Center
Matthew Hurst, Intelliseek Applied Research Center
Eytan Adar, Hewlett Packard Labs
Location: 304


The weblogging microcosm has evolved into a distinct form, into a community of publishers. The 
strong sense of community amongst bloggers distinguishes weblogs from the various forms of online 
publications such as online journals, magazines and newsletters that flourished in the early days 
of the web and from traditional media such as newspapers, magazines and television. The use of 
weblogs primarily for publishing, as opposed to discussion, differentiates blogs from other online 
community forums, such as Usenet newsgroups and message boards. Often referred to as the 
blogosphere, the network of bloggers is a thriving ecosystem, with its own internally driven dynamics.
The cross-linking that takes place between blogs, through blogrolls, explicit linking, trackbacks,
and referrals, creates implicit and explicit networks which define the communities of the weblogging
world. There is work underway to understand
the dynamics of the weblogging network, much of which springs from bloggers themselves.
The self-publishing aspect of weblogs, the time-stamped entries, the highly interlinked nature of 
the blogging community and the significant impact of weblog content on politics, ideas, and culture 
make them a fascinating subject of study. 


The objective of this workshop is to provide a forum for sharing research on the blogging ecosystem. 
The workshop will consist of technical papers, panel discussions, and demonstrations of research 
prototypes. 


WF11 | Activities on Semantic Web Technologies in Japan
Noboru Shimizu, Keio Research Institute
Hideaki Takeda, National Institute of Informatics
Location: Hotel New Otani, Rindo (East)


The Semantic Web is a new Web technology that has the potential to transform the existing
information society. In Japan, research institutes and industries are advancing various research projects
on the Semantic Web and developing various practical applications. 


In this workshop, each presenter will outline a research project or practical application
of the Semantic Web in Japan, and some will demonstrate software. As Japan is the
host country, one purpose of the workshop is to introduce Japanese activities in the Semantic
Web field to participants from other countries.


WF12 | MobEA III - Customer Focused Mobile Services
Johan Hjelm, Ericsson
Annakaisa Hayrynen, Elisa Communication Research Center
Bin Wei, AT&T Shannon Laboratory
Rittwik Jana, AT&T Labs-Research
Location: Hotel New Otani, Rindo (West)


We are in the midst of a mobile revolution. In order to realize the vision of pervasive mobile 
computing, the services provided have to be adapted to the users’ wants and needs. To do this, we
need to go beyond technology, and understand the human-centric aspects of mobile computing. The 
objective of this workshop is to provide a single forum for researchers and technologists to discuss 
the state-of-the-art, present their contributions, and set future directions in emerging innovative 
applications for mobile wireless access. 


Topics of interest for technical papers include, but are not limited to, the following:
• Mobile web usage analysis
• Peer-to-peer mobile computing
• Security of mobile applications
• Methods for measuring mobile application usage
• Models and methods for qualitative analysis of applications usage
• User interface for mobile devices
• Multimedia applications
• Enterprise applications
• Open-standards and applications
• Performance studies of mobile applications
• Context-aware services and applications
• Mobility issues of web services


Tuesday - Full Day Tutorials


ID | TITLE | PRESENTERS | LOCATION
TF01 | Network and Web Services Security concepts using Java | “Rags” Srinivas | Sumire
TF03 | Web Engineering | Yogesh Deshpande and Martin Gaedke | Yuri
TF05 | Internationalizing Web Content and Web Technology | Richard Ishida and Martin Dürst | Suisen


TF01| Network and Web Services Security concepts using Java 
Raghavan “Rags” Srinivas, Technology Evangelist, Sun Microsystems 
Location: Hotel New Otani, Sumire 


Network and web services security concepts are fairly straightforward and simple to understand 
from a developer viewpoint, especially in conjunction with some working code that can be deployed 
on the Java platform and security tools that are generally available. 


Attend this session to put into practice some of the security concepts that you’ve heard about or
learnt, and to see how to connect those dots in the implementation of real-life solutions. The session
will walk through topics ranging from generating digests and signatures, and generating and using
keys and certificates, to advanced concepts such as the Advanced Encryption Standard (AES). The
newer concepts of web services security will be covered as well.
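
As a rough illustration of the digest concept mentioned above, here is a minimal sketch. It is written in Python with the standard library rather than on the Java platform used in the tutorial, and the message and key values are hypothetical. A keyed digest (HMAC) stands in as a symmetric relative of the signatures the session covers; it is not a public-key signature.

```python
import hashlib
import hmac

# Hypothetical message and shared secret, for illustration only.
message = b"WWW2005 tutorial TF01"
key = b"hypothetical-shared-secret"

# A digest is a fixed-length fingerprint of the message (SHA-256 here).
digest = hashlib.sha256(message).hexdigest()

# A keyed digest (HMAC) also depends on a secret shared by sender and receiver.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key, message, tag):
    """Recompute the HMAC and compare it to the received tag in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

Calling `verify(key, message, tag)` succeeds for the original message and fails if the message is altered, which is the integrity property that digests and signatures are meant to provide.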


TF03 | Web Engineering: Developing Successful Web Applications in a Systematic Way

Yogesh Deshpande, University of Western Sydney 

Martin Gaedke, University of Karlsruhe 

Location: Hotel New Otani, Yuri 


The Web environment is characterised by millions of Web sites and thousands of Web-based appli- 
cations. The numbers will continue to grow as more and more countries and organizations adopt 
and adapt to the Web. Good Web development requires understanding of numerous issues and 
strategies that span many disciplines, both computing and noncomputing. However, there are very 
few standard methods for the Web developers to use. To add to the complexity, user expectations 
and needs change over time. Web technologies and standards also continue to evolve. Consequently, 
even the successful Web sites and applications require constant attention and modifications that are 
best described more as evolution than just maintenance, as understood in software development. 
Hence, there is a strong need to understand and undertake Web engineering. Engineering has tra- 
ditionally addressed the issues of process management and product development, adapting them to 
the local environment or users as needed. Web development is truly global in its scope, as implied 
by the W3C’s Initiatives and Working Groups on personalisation, internationalisation and device 
independence, among others. Web Engineering reflects this global perspective in a systematic and 
multidisciplinary way. This tutorial will cover the issues of process management and product devel- 
opment in developing large Web sites and applications. It will analyse and highlight the challenges 
posed by the global perspective and present strategies that developers could follow for successful 
Web application development. There will be an extensive use of case studies throughout. 
TF05 | Internationalizing Web Content and Web Technology
Martin Dürst, Aoyama Gakuin University

Richard Ishida, World Wide Web Consortium 

Location: Hotel New Otani, Suisen 


Internationalization of Web content and Web technology means dealing with the world-wide vari- 
ation in language, script, and culture. This tutorial starts with an introduction to writing system 
characteristics and how they affect Web technology. Next, character encoding is discussed in de- 
tail, with a focus on Unicode/ISO 10646 and encodings used in Asia. This leads to the model for 
character encoding on the Web, which is common to formats such as HTML, XHTML, XML, CSS, 
and RDF, and practical advice for document encoding and labelling. 
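
To make the encoding discussion concrete, the following minimal sketch (Python standard library; the sample string is an arbitrary choice, not taken from the tutorial) shows how the same Japanese text becomes different byte sequences under Unicode's UTF-8 and two encodings commonly used in Japan, which is why documents need a correct encoding label.

```python
# Arbitrary sample text: three kanji meaning "Japanese language".
text = "日本語"

# The same characters serialized under three different encodings.
utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")
eucjp_bytes = text.encode("euc_jp")


def decode_labelled(data, label):
    """Decode bytes using the encoding named by the document's label."""
    return data.decode(label)
```

Decoding succeeds only when the label matches the encoding that was actually used; a mismatched label yields either an error or garbled text.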


Besides internationalized content, we discuss Internationalized Domain Names (IDN) and Internationalized
Resource Identifiers (IRI), two new technologies for making the Web experience more seamless
for non-English and non-Latin users. We then continue with international markup, including lan- 
guage and bidirectional markup, international rendering and styling, including recent work on CSS3 
focused on the needs of Asia, and international processing, including XSLT, XQuery, and Web Ser- 
vices. 
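
As a small sketch of what IDN and IRI handling can look like in practice (Python standard library; the host name and path are hypothetical examples), a non-ASCII host name can be converted to its ASCII-compatible form, and a non-ASCII path can be percent-encoded as UTF-8 to obtain a plain URI. Python's built-in `idna` codec implements the older IDNA 2003 rules, so this is an approximation of current practice.

```python
from urllib.parse import quote, unquote

# Hypothetical internationalized host name and path.
host = "bücher.example"
path = "/日本語"

# IDN: each non-ASCII label is converted to an ASCII "xn--" (Punycode) label.
ascii_host = host.encode("idna").decode("ascii")

# IRI to URI: non-ASCII characters are percent-encoded as UTF-8 bytes.
ascii_path = quote(path)

uri = "http://" + ascii_host + ascii_path
```

The mapping is reversible: `unquote(ascii_path)` recovers the original path, so software can display the internationalized form while transmitting the ASCII one.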
Tuesday - Morning Tutorials


ID | TITLE | PRESENTERS | LOCATION
TA03 | Standards-Based Design | Eric Meyer | 101B
TA05 | Introduction to RDF Query with SPARQL | Dave Beckett, Steve Harris, Eric Prud'hommeaux and Andy Seaborne | 101A
TA07 | Web Content Mining | Bing Liu | 301A


TA03 | Standards-Based Design 
Eric A. Meyer, Complex Spiral Consulting 
Location: 101B 


The overall goal of this tutorial is to make attendees familiar with the current state of standards-
oriented design and to improve their skills in this area. It will not spend time on “selling” the benefits
of such an approach, but will instead focus on how designers can more easily attain those benefits 
in the real world. 


The tutorial will be split into four subtopics, each taking up about a quarter of the time available. 
The subtopics are: creating a development environment for free; the pros and cons of specific CSS 
design techniques; recent advances that improve standards support and counter CSS limitations; 
and current trends in standards-oriented design. The session will be interactive, with audience 
questions and observations very much encouraged. 


TA05 | Introduction to RDF Query with SPARQL
Dave Beckett, University of Bristol 

Steve Harris, University of Southampton 

Eric Prud'hommeaux, World Wide Web Consortium 

Andy Seaborne, Hewlett-Packard Laboratories, Bristol 

Location: 101A 


SPARQL is the query language and protocol for RDF being designed by the W3C. Around May 
2005 the plan is that the work will be in its final stages (at Last Call stage) and that several 
compatible implementations will be shipping products supporting it. 


The purposes of this tutorial are to introduce SPARQL and to explain its benefits over other
approaches for querying RDF, enabling easy access to and manipulation of RDF data.


We will demonstrate how SPARQL can be used to significantly simplify the development of Semantic
Web applications, enabling easy reuse of existing RDF data as well as the building of new RDF data services.


The tutorial is divided into two sections. In the first section, we give an overview of SPARQL’s key
features in accessing RDF, constraining it and producing result formats. By the end of the first 
section, the attendees should be able to write simple queries for extracting RDF data. In the second 
section, we propose to focus on applying SPARQL in the development of a small Semantic Web 
application, taking RDF from a number of data services, matching and transforming and using it 
to generate a variety of outputs including more RDF, XML and, via XML transformations, HTML.


The tutorial will be delivered as slides with demonstrations of SPARQL queries done using web 
forms in the browser and on the command line. These will talk to servers providing SPARQL 
query over example data. Some software API work may be shown in Python, Java and possibly C
depending on the availability of implementations in May 2005.


This tutorial will be presented at a level accessible to Web programmers, advanced developers and 
experienced students. 
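
To give a flavor of the kind of query the tutorial introduces, here is a toy sketch in Python. It is not a SPARQL implementation: it only mimics how a conjunction of triple patterns with variables is matched against RDF-like data. All identifiers and data below are hypothetical.

```python
# Toy RDF-like data: (subject, predicate, object) triples. Hypothetical values.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant must match exactly.
            return None
    return result


def select(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match_pattern(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# Roughly: SELECT ?name WHERE { ex:alice foaf:knows ?x . ?x foaf:name ?name }
rows = select(
    [("ex:alice", "foaf:knows", "?x"), ("?x", "foaf:name", "?name")],
    TRIPLES,
)
names = [row["?name"] for row in rows]
```

The shared variable `?x` joins the two patterns, which is the basic mechanism behind extracting related RDF data with a single query.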


TA07 | Web Content Mining
Bing Liu, University of Illinois at Chicago 
Location: 301A 


Web mining aims to develop a new generation of tools and techniques to effectively extract and/or 
mine useful information or knowledge from the Web. It consists of Web usage mining, Web structure 
mining, and Web content mining. Web usage mining refers to the discovery of user access patterns 
from Web usage logs. Web structure mining tries to discover useful knowledge from the structure 
of hyperlinks. Web content mining aims to extract/mine useful information or knowledge from web 
page contents. 


In this tutorial, we focus on Web content mining. In the past few years, there has been a rapid
expansion of activities in the Web content mining area. This tutorial will introduce the main mining
tasks/problems and state-of-the-art techniques for solving these problems. Topics include:
data/information extraction, mining the Web to build concept hierarchies or ontology, mining for 
Web information integration, segmenting Web pages and detecting noise, mining online opinion 
sources such as reviews and forums, etc. All these tasks and their associated techniques have 
immediate applications in the real world. 


The tutorial will have many examples to help participants to better understand the concepts and 
techniques, and also to illustrate how they can be deployed in practice to help businesses. All parts 
of the tutorial will have a mix of research and industry flavor, addressing seminal research concepts 
and also looking at the technology from an industry point of view. Thus, apart from researchers 
and graduate students, we particularly encourage practitioners from industry to participate. 
Tuesday - Afternoon Tutorials


ID | TITLE | PRESENTERS | LOCATION
TP01 | Basis for Automatic Web Service Composition | Giuseppe De Giacomo, Daniela Berardi and Massimo Mecella | 101A
TP06 | Matching Words and Pictures: Problems, Applications and Progress | Latifur Khan | 301A
TP07 | Location-based Services in Mobile Information Systems: Architectures, Description, and Systems | Ling Liu | 101B


TP01 | Basis for Automatic Web Service Composition
Giuseppe De Giacomo, Università di Roma “La Sapienza”

Daniela Berardi, Università di Roma “La Sapienza”

Massimo Mecella, Università di Roma “La Sapienza”

Location: 101A 


The tutorial aims at providing a deep comprehension of the Web Service Composition problem and of
automated techniques to tackle it. Web Service Composition is currently one of the most hyped and
most addressed issues in Service Oriented Computing. Starting from an analysis of current technologies
and standards for Web Service composition, the tutorial will lead the attendees to consider formal 
models at the base of current proposals, and techniques that can be fruitfully considered to address 
automatic composition synthesis in each of them. 


TP06 | Matching Words and Pictures: Problems, Applications and Progress 
Latifur Khan, University of Texas at Dallas 
Location: 301A 


The development of technology generates huge amounts of non-textual information, such as images. 
An efficient image annotation and retrieval system is highly desired. Clustering algorithms make 
it possible to represent visual features of images with finite symbols. Based on this, many statisti- 
cal models, which analyze correspondence between visual features and words and discover hidden 
semantics, have been published. These models improve the annotation and retrieval of large image 
databases. However, image data usually have a large number of dimensions. Traditional cluster- 
ing algorithms assign equal weights to these dimensions, and become confounded in the process of 
dealing with these dimensions. 


In this tutorial, we will first present the current state of the art and its shortcomings. Second, we
will present a weighted feature selection algorithm as a solution to the existing problem. For a given
cluster, we determine relevant features based on histogram analysis and assign greater weight to
relevant features than to less relevant ones. Third, we will exploit spatial correlation
to disambiguate visual features; spatial relationships will be constructed by spatial association
rule mining. Fourth, we will demonstrate various models, including the current state of the art, that link
visual tokens with keywords based on the clustering results of the K-means algorithm with weighted
feature selection and without feature selection, and will evaluate performance in terms of precision, recall
and correspondence accuracy on a benchmark dataset. Fifth, we will show that weighted feature
selection is better than traditional approaches for automatic image annotation and retrieval. Finally, we
will discuss open problems and future directions in the domain of image and video.
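
The idea of giving relevant dimensions more weight within a cluster can be sketched as follows. This is an illustrative Python example under stated assumptions, not the presenter's algorithm: the per-cluster weights are hard-coded hypothetical values, whereas the tutorial derives them from histogram analysis.

```python
import math


def weighted_distance(point, centroid, weights):
    """Euclidean distance in which each dimension is scaled by a feature weight."""
    return math.sqrt(
        sum(w * (p - c) ** 2 for p, c, w in zip(point, centroid, weights))
    )


def assign(point, centroids, cluster_weights):
    """Return the index of the nearest centroid, using each cluster's own weights."""
    distances = [
        weighted_distance(point, centroid, weights)
        for centroid, weights in zip(centroids, cluster_weights)
    ]
    return distances.index(min(distances))


# Hypothetical two-dimensional visual features.
centroids = [(0.0, 1.0), (3.0, 5.0)]
point = (3.0, 1.0)

# Traditional clustering: every dimension counts equally in every cluster.
equal_weights = [(1.0, 1.0), (1.0, 1.0)]

# Weighted variant: the second dimension is treated as barely relevant for cluster 1.
learned_weights = [(1.0, 1.0), (1.0, 0.1)]

equal_choice = assign(point, centroids, equal_weights)
weighted_choice = assign(point, centroids, learned_weights)
```

With equal weights the point falls into cluster 0; once the less relevant dimension of cluster 1 is down-weighted, the same point is assigned to cluster 1, which shows how weighting changes the clustering that later annotation steps build on.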


TP07 | Location-based Services in Mobile Information Systems: Architec- 
tures, Description, and Systems 

Ling Liu, Georgia Institute of Technology 

Location: 101B 


With the growing market of sensing and positioning technologies and the growing popularity and 
availability of mobile communications, location-based information management has become an im- 
portant problem in mobile computing systems. Furthermore, the computational capabilities in 
mobile devices, ranging from navigational systems in cars to hand-held devices and cell phones, 
continue to rise, making mobile devices increasingly accessible. However, significant research efforts 
to date have been devoted to location management techniques and location-based services in cen- 
tralized location monitoring systems. Very few have studied the distributed approach to real-time 
location monitoring. We argue that for mobile applications that need to manage a large and grow- 
ing number of mobile objects, the centralized approaches do not scale well in terms of server load 
and network bandwidth, and are vulnerable to a single point of failure.


This tutorial presents the necessary concepts, architectures, techniques, and infrastructure to un- 
derstand Location-based Services in mobile information systems. The tutorial is designed to be 
self-contained, and gives the essential background for anyone planning to learn about and contribute 
to the principles and applications of location-based services in mobile commerce and geographical 
information systems. It guides practitioners by highlighting best practices in location based in- 
formation monitoring and introduces students and advanced developers to design and engineering 
issues in building scalable and privacy-aware distributed location based services, including the key 
trade-offs, as well as the limitations of current approaches. This tutorial is presented at a senior 
or beginning graduate student level. It is accessible to Web programmers, advanced application 
developers, and graduate students. 
9:00 - 9:30


Opening Ceremony 
Location: Convention Hall 


9:30 — 10:30 
Keynote Speech 


Location: Convention Hall 


Session Chair: Allan Ellis, Southern Cross University 


Room | 9:00 - 9:30 | 9:30 - 10:30 | 13:30 - 14:30
Hall | Opening Ceremony | Keynote: Tim Berners-Lee, World Wide Web Consortium | Keynote: Yuji Inoue, NTT

Room | 10:50 - 12:20 | 14:40 - 16:10 | 16:30 - 18:00
301 | PT01 Usage Analysis | PT04 Semantic Querying | PT07 Semantic Web
302 | PT02 Wide-area Architectures and Protocols | PT05 Web Services | PT08 Applications
303 | PT03 Data Extraction | PT06 Web Application Design | PT09 Indexing and Querying
304 | IT01 Industrial and Practical Experience Track (Paper 1) | IT02 Industrial and Practical Experience Track (Invited 1) | IT03 Industrial and Practical Experience Track (Invited 2)
201 | PANEL01 Can semantic web be made to flourish? | PANEL02 Current trends in the integration of search and browsing | PANEL03 Do we need more web performance research?
ICR | W3C01 Enabling the Mobile Web | W3C02 Accessibility Aspects within Mobile Web and Other Developing Technologies | W3C03 Foundations And Future Directions of Web Services

Room | 18:30 - 20:30
Exhibition Hall 8 | Poster Reception


WWW at 15 Years: Looking Forward 
Tim Berners-Lee, World Wide Web Consortium 


The key property of the WWW is its universality: One must be able to access it whatever 
the hardware device, software platform, and network one is using, and despite the disabilities 
one might have, and whether one is in a “developed” or “developing” country; it must support 
information of any language, culture, quality, medium, and field without discrimination so that 
a hypertext link can go anywhere; it must support information intended for people, and that 
intended for machine processing. The Web architecture incorporates various choices which 
support these axes of universality. 


Currently the architecture and the principles are being exploited in the recent Mobile Web 
initiative in W3C to promote content which can be accessed optimally from conventional com- 
puters and mobile devices. New exciting areas arise every few months as possible Semantic 
Web flagship applications. As new areas burst forth, the fundamental principles remain im- 
portant and are extended and adjusted. At the same time, the principles of openness and 
consensus among international stakeholders which the WWW Consortium (W3C) employs for 
new technology are adjusted, but ever-important. 
10:30 - 10:50


Break 


10:50 — 12:20 Parallel Sessions 


PT01 | Usage Analysis
Location: Room 301
Session Chair: Bing Liu, University of Illinois at Chicago


Semantic Similarity Between Search Engine Queries Using Temporal Correlation 
Steve Chien, Microsoft Research 
Nicole Immorlica, MIT 


Duplicate Detection in Click Streams 

Ahmed Metwally*, Divyakant Agrawal† and Amr El Abbadi†
* UCSB, ValueClick, Inc.
† UCSB


Improving Recommendation Lists Through Topic Diversification 
Cai-Nicolas Ziegler*, Sean M. McNee†, Joseph A. Konstan† and Georg Lausen*
* University of Freiburg, Germany
† University of Minnesota, USA


PT02 | Wide-area Architectures and Protocols
Location: Room 302
Session Chair: Misha Rabinovich, AT&T Labs - Research


GlobeDB: Autonomic Data Replication for Web Applications 

Swaminathan Sivasubramanian*, Gustavo Alonso†, Guillaume Pierre* and Maarten van Steen*
* Department of Computer Science, Vrije Universiteit, Amsterdam
† Department of Computer Science, Swiss Federal Institute of Technology (ETH), Zurich


Hierarchical Substring Caching for Efficient Content Distribution to Low-Bandwidth 
Clients 
Utku Irmak and Torsten Suel, Polytechnic University 


Executing Incoherency Bounded Continuous Queries at Web Data Aggregators 
Rajeev Gupta*, Ashish Puri† and Krithi Ramamritham†
* IBM, India Research Lab
† Indian Institute of Technology, Bombay


PT03 | Data Extraction
Location: Room 303
Session Chair: Shinichi Morishita, University of Tokyo


Fully Automatic Wrapper Generation for Search Engines 

Hongkun Zhao*, Weiyi Meng*, Zonghuan Wu†, Vijay Raghavan† and Clement Yu‡
* State University of New York at Binghamton
† University of Louisiana at Lafayette
‡ University of Illinois at Chicago
Web Data Extraction Based on Partial Tree Alignment
Yanhong Zhai and Bing Liu, University of Illinois at Chicago 


Thresher: Automating the Unwrapping of Semantic Content from the World Wide 
Web 

Andrew Hogue, Google, Inc., MIT CSAIL 

David Karger, MIT CSAIL 


IT01 | Industrial and Practical Experience Track (Paper 1)
Location: Room 304 
Session Chair: Alex Arsky, Yahoo! 


A personalized search engine based on web-snippet hierarchical clustering 
P. Ferragina and Antonio Gullí, Dipartimento di Informatica, Pisa


Ranking Definitions with Supervised Learning Methods
Jun Xu*, Yunbo Cao†, Hang Li† and Min Zhao‡
* Nankai University
† Microsoft Research Asia
‡ Chinese Academy of Sciences


Identifying link farm spam pages
Baoning Wu and Brian D. Davison, Lehigh University


The Volume and Evolution of Web Page Templates
David Gibson*, Kunal Punera† and Andrew Tomkins*
* IBM Almaden Research Center
† University of Texas at Austin


PANEL01 | Can semantic web be made to flourish?

Location: Room 201 

Moderator: David Wood, Software Memetics 

Panelists: Zavisa Bjelogrlic, Co-founder, Osemantics 
Bernadette Hyland, Co-founder, Tucana Technologies 
Prof. Jim Hendler, Director, MIND Lab, University of Maryland 
Kanzaki Masahide, Consultant, Kanzaki.com 


This panel's objective will be to discuss whether the Semantic Web can be made to grow in 
a “viral” manner, like the World Wide Web did in the early 1990s. The scope of the discus- 
sion will include efforts by the World Wide Web Consortium's Semantic Web Best Practices 
& Deployment Working Group to identify and publish best practices of Semantic Web prac- 
titioners, and the barriers to adoption of those practices by a wider community. The concept 
of “best practices” as it applies to a distributed, diverse and partially-defined Semantic Web 
will be discussed and its relevance debated. Specifically, panelists will discuss the capability of 
standards bodies, commercial companies and early adopters to create a viral technology. 
W3C01 | Enabling the Mobile Web
Location: International Conference Room 
Session Chairs: Dan Appelquist, Vodafone
Stéphane Boyera, W3C Device Independence Activity Lead 
Speakers: Dan Appelquist, Vodafone
Stéphane Boyera, W3C Device Independence Activity Lead 


The session’s goal is twofold: to present in detail the objectives and roadmap of W3C’s work in
the Mobile Web area (in Mobile Web Best Practices and Device Description), and to get feedback
and input from Japanese companies involved in the mobile market in Japan, which is very advanced
in terms of enabling and accessing the Web on mobile devices.

The session will mainly consist of a panel discussion in which all actors of the mobile delivery
chain will be represented.


12:20 — 13:30 
Lunch 
Location: Exhibition Hall 8 


13:30 — 14:30 
Keynote Speech 
Location: Convention Hall 

Session Chair: Tatsuya Hagino, Keio University 


Innovation for a Human-Centered Network 
Yuji Inoue, NTT 


This talk presents NTT’s approach for realizing a Human-Centered Network. Last November, 
we announced the NTT Group's Medium-Term Management Strategy, which consists of three
management objectives: (1) building the ubiquitous broadband market and helping achieve 
the e-Japan Strategy and the u-Japan Initiative; (2) building a safe, secure, and convenient 
communications network environment and broadband access infrastructure, while achieving a 
seamless migration from the legacy telephone network to the next generation network; and (3) 
striving to increase corporate value and achieve sustainable growth. Since the management 
strategy takes account of Japan's future social issues such as a declining birthrate and an aging
population, the need to reduce the environmental load, etc., we believe that the R&D activities
directed towards accomplishing these objectives will consequently lead to the realization of a
Human-Centered Network.


14:40 — 16:10 Parallel Sessions 


PT04 | Semantic Querying
Location: Room 301 
Session Chair: Andrew Tomkins, IBM Almaden Research Center 
Ranking a Stream of News

Gianna M. Del Corso*, Antonio Gullí† and Francesco Romani*
* Dip. di Informatica, Pisa, Italy
† Dip. di Informatica, Pisa, Italy and IIT-CNR

Algorithmic Detection of Semantic Similarity 

Ana Maguitman, Filippo Menczer, Heather Roinestad and Alessandro Vespignani, Indiana 
University 


SemRank: Ranking Complex Relationship Search Results on the Semantic Web 
Kemafor Anyanwu, Angela Maduko and Amit Sheth, University of Georgia 


PT05 | Web Services
Location: Room 302
Session Chair: Arnaud Sahuguet, Bell Labs


A Service Creation Environment based on End to End Composition of Web Ser- 
vices 

Vikas Agarwal, Koustuv Dasgupta, Neeran Karnik, Arun Kumar, Ashish Kundu, Sumit Mittal 
and Biplav Srivastava, IBM India Research Lab 


Ensuring Required Failure Atomicity of Composite Web Services 
Sami Bhiri, Olivier Perrin and Claude Godart, LORIA-INRIA France 


Web Service Interfaces 

Dirk Beyer, EPFL, Lausanne, Switzerland 

Arindam Chakrabarti, University of California, Berkeley 
Thomas A. Henzinger, EPFL Lausanne, CH & UC Berkeley 


PT06 | Web Application Design
Location: Room 303
Session Chair: Geert-Jan Houben, Eindhoven University of Technology


Building Adaptable and Reusable XML Applications with Model Transformations 
Ivan Kurtev and Klaas van den Berg, Software Engineering Group, University of Twente 


Exception Handling in Workflow-Driven Web Applications 

Marco Brambilla, Stefano Ceri, Sara Comai and Christina Tziviskou, Politecnico di Milano


AwareDAV: A Generic WebDAV Notification Framework and Implementation
Henning Qin Jehøj, Niels Olof Bouvin and Kaj Grønbæk, Department of Computer Science,
University of Aarhus


IT02 | Industrial and Practical Experience Track (Invited 1)
Location: Room 304 
Session Chair: Kazuo Iwano, IBM Japan 


DoCoMo’s Challenge Towards New Mobile Services 
Kiyoyuki Tsujimura, NTT DoCoMo 


Automatic Text Processing to Enhance Product Search for On-line Shopping 
Gilles Vandelle, Kelkoo 
Approach and the problem of business of NHN Group in South Korea, Japan, and
China 
TBD, NHN Corporation Group 


PANEL02 | Current trends in the integration of search and browsing
Location: Room 201 
Moderators: Andrei Z Broder, IBM T.J. Watson Research Center, USA 
Yoelle S. Maarek, IBM Research, Israel 
Panelists: Krishna Bharat, Principal Scientist, Google Inc. 
Susan Dumais, Senior Researcher, Microsoft Research 
Steve Papa, Founder and CEO, Endeca 
Jan Pedersen, Chief Scientist, Yahoo Inc. 
Prabhakar Raghavan, Senior Vice President and CTO, Verity, Inc. 


Searching and browsing have been the two basic information discovery paradigms since the early days
of the Web. More than ten years down the road, three schools seem to have emerged: (1)
the search-centric school argues that guided navigation is superfluous, since free form search
has become so good and the search UI so common that users can satisfy all their needs via
simple queries; (2) the taxonomy navigation school claims that users have difficulties expressing
informational needs; and (3) the meta-data centric school advocates the use of meta-data for
narrowing large sets of results, and is successful in e-commerce, where it is known as
“multi-faceted search”. This panel brings together experts and advocates for all three schools, who
will discuss these approaches and share their experiences in the field. We will ask the audience 
to challenge our experts with real information architecture problems. 


W3C02 | Accessibility Aspects within Mobile Web and Other Developing Technologies
Location: International Conference Room 
Session Chair: Shawn Henry, W3C WAI Activity Team 
Speakers: Shawn Henry, W3C WAI Activity Team 
Wendy Chisholm, W3C 


In the first part of this session we explain the interdependencies between essential components of 
Web accessibility, and show that the responsibility for Web accessibility goes beyond the content 
developer to include developers of authoring tools, user agents, assistive technologies, and 
technical specifications. We provide a brief update on WCAG 2.0, ATAG 2.0, and international 
Web accessibility developments. 


In the second part, we explore how the knowledge and experience in Web accessibility helps 
inform the development of emerging Web technologies, including the mobile Web, multimodal 
interaction, and content adaptation. 


16:10 — 16:30 
Break 
16:30 - 18:00 Parallel Sessions


PT07 | Semantic Web


Location: Room 301 
Session Chair: Jim Hendler, University of Maryland 


Learning Domain Ontologies for Web Service Descriptions: an Experiment in 
Bioinformatics 

Marta Sabou*, Chris Wroe†, Carole Goble† and Gilad Mishne‡
* Vrije Universiteit Amsterdam
† University of Manchester
‡ University of Amsterdam


Making RDF Presentable — Selection, Structure and Surfability for the Semantic 
Web 
Lloyd Rutledge, Jacco van Ossenbruggen and Lynda Hardman, CWI 


PT08 | Applications
Location: Room 302
Session Chair: Jonathan Trevor, Fuji-Xerox Palo Alto Laboratory


Shared Lexicon for Distributed Annotations on the Web 
Paolo Avesani and Marco Cova, ITC-irst 


Using XForms to Simplify Web Programming 
Richard Cardone, Danny Soroker and Alpana Tiwari, IBM Watson Research Center 


Web-Assisted Annotation, Semantic Indexing and Search of Television and Radio 
News 

Mike Dowman*, Valentin Tablan*, Hamish Cunningham* and Borislav Popov†
* University of Sheffield
† Ontotext Lab, Sirma AI EAD


PT09 | Indexing and Querying
Location: Room 303 
Session Chair: Frank McSherry, Microsoft 


Improving Web Search Performance Via a Locality Based Static Pruning Method 
Edleno S. de Moura*, Célia Francisca dos Santos*, Daniel R. Fernandes*, Altigran S. da Silva*,
Pável P. Calado† and Mario Nascimento‡
* Federal University of Amazonas
† INESC
‡ University of Alberta


Sampling Search-Engine Results

Aris Anagnostopoulos, Brown University 
Andrei Broder, IBM Watson Research Lab 
David Carmel, IBM Research Lab in Haifa 


Three-Level Caching for Efficient Query Processing in Large Web Search Engines 
Xiaohui Long and Torsten Suel, Polytechnic University 
IT03 | Industrial and Practical Experience Track (Invited 2)
Location: Room 304 
Session Chair: Naohiko Uramoto, IBM Tokyo Research Laboratory 


Internet Search Engines: Past and Future 
Jan Pedersen, Yahoo! 

News in the Age of the Web 

Krishna Bharat, Google 


Technical Challenges in Exploiting the Web as a Business Resource 
Andrew Tomkins, IBM 


PANEL03 | Do we need more web performance research?
Location: Room 201 
Moderator: Michael Rabinovich, AT&T Labs - Research, USA 
Panelists: Giovanni Pacifici, IBM Watson Research Center, USA 
Michele Colajanni, University of Modena, Italy 
Krithi Ramamritham, IIT Bombay, India
Bruce Maggs, CMU/Akamai, USA 


This panel will discuss the future and purpose of Web performance research, concentrating on 
the reasons for modest success in the adoption of research results in practice. The panel will 
in particular examine factors that hinder technology transfer in the Web performance area, 
consider examples of past successes and failures in this arena, and stimulate the discussion on 
how to make Web performance research more relevant. 


W3C03 | Foundations And Future Directions of Web Services
Location: International Conference Room 
Session Chair: Hugo Haas, W3C Web Services Activity Lead 
Speakers: Hugo Haas, W3C Web Services Activity Lead 
Charlton Barreto, webMethods 


This session will give an overview of the motivation for Web services and of how the technologies
standardized at W3C fit together, starting with the messaging framework (SOAP 1.2, MTOM,
WS-Addressing 1.0) and continuing with the description languages for services and choreographies
(WSDL 2.0, WS-CDL 1.0). Finally, this presentation will discuss future work considered
to continue making the Web services’ promise a reality.


18:30 — 20:30 
Poster Reception 
Location: Exhibition Hall 8 
Room | 9:00 - 10:00 | 13:30 - 14:30
Hall | Keynote: Eric Brewer, University of California at Berkeley | Keynote: Lorrie Cranor, Carnegie Mellon University

Room | 10:30 - 12:00 | 14:40 - 16:10 | 16:30 - 18:00
301 | PT10 XML Query and Programming Languages | PT13 Web Engineering with Semantic Annotation | PT16 Semantic Search
302 | PT11 Web-based Educational Applications | PT14 User-focused Search and Crawling | PT17 Security Through the Eyes of Users
303 | PT12 Text Analysis and Extraction | PT15 Trustworthy web sites | PT18 Measurements and analysis
304 | IT04 Industrial and Practical Experience Track (Paper 2) | IT05 Industrial and Practical Experience Track (Panel), 14:40 - 17:00
201 | PANEL04 Mobile Multimedia Services | PANEL05 On culture in a world-wide information society: Toward the knowledge society - the challenge | PANEL06 Exploiting the dynamic networking effects of the web
ICR | W3C04 Privacy and the Semantic Web | W3C05 Recent Work in the Semantic Web Activity: Query and Best Practices | W3C06 Web Internationalization Developments

Room | 18:30 - 20:30
Tsuru | Conference Dinner


9:00 — 10:00 
Keynote Speech 


Location: Convention Hall 


Session Chair: Prabhakar Raghavan, Verity, Inc. 


The Case for Technology for Developing Regions 
Eric Brewer, University of California, Berkeley 


Moore's Law and the wave of technologies it enabled have led to tremendous improvements 
in productivity and the quality of life in the industrialized world. Yet, technology has had 
almost no effect on the four billion people that make less than US$2000/year. In this talk I argue
that the decreasing costs of computing and wireless networking make this the right time to 
spread the benefits of technology, and that the biggest missing piece is a lack of focus on 
the problems that matter, including health, education, and government. After covering some 
example applications that have shown very high impact, I take an early look at the research 
agenda for developing regions. Finally, I examine some of the pragmatic issues required to make 
progress on these very challenging problems. My goal is to convince high-tech researchers that 
technology for developing regions is an important and viable research topic. 


10:00 — 10:30 
Break 
10:30 — 12:00 Parallel Sessions 
PT10| XML Query and Programming Languages 


Location: Room 301 
Session Chair: Jim Webber, University of Sydney 


Sub-document queries over XML with XSQuirrel 
Arnaud Sahuguet, Bell Labs research 
Bogdan Alexe, Ecole Polytechnique + ENST 


XJ : Facilitating XML Processing in Java 

Matthew Harren*, Mukund Raghavachari†, Oded Shmueli‡, Michael Burke†, Rajesh Bordawekar†, 
Igor Pechtchanski† and Vivek Sarkar† 
* University of California, Berkeley 
† IBM T.J. Watson Research Center 
‡ Technion Israel Institute of Technology 


XQuery Containment in Presence of Variable Binding Dependencies 
Li Chen, San Diego Supercomputer Center 
Elke Rundensteiner, Worcester Polytechnic Institute 


PT11| Web-based Educational Applications 
Location: Room 302 
Session Chair: Lora Aroyo, Eindhoven University of Technology 


eBag - a Ubiquitous Web Infrastructure for Nomadic Learning 

Christina Brodersen*, Bent Guldbjerg Christensen*, Kaj Gronbaek* and Christian Dindler† 
* Department of Computer Science, University of Aarhus 
† Department of Information- and Media Sciences, University of Aarhus 


Online Curriculum on the Semantic Web: The CSD-UoC Portal for Peer-to-Peer 
e-learning 

Sofia Pediaditaki, Apostolos Apostolidis and Dimitris Kotzinos, Department of Computer Sci- 
ence, University of Crete 

The Classroom Sentinel: Supporting Data-Driven Decision Making in the Classroom 
Mark K. Singley and Richard B. Lam, IBM T.J. Watson Research Center 


PT12| Text Analysis and Extraction 
Location: Room 303 
Session Chair: Filippo Menczer, Indiana University 


Topic Segmentation of Message Hierarchies for Indexing and Navigation Support 
Jong Wook Kim, K. Selcuk Candan and Mehmet E. Donderler, Arizona State University 

Gimme’ The Context: Context-driven automatic semantic annotation with C-PANKOW 
Philipp Cimiano*, Günter Ladwig* and Steffen Staab† 
* Institute AIFB, University of Karlsruhe 
† Institute for Computer Science, University of Koblenz-Landau 


Opinion Observer: Analyzing and Comparing Opinions on the Web 
Bing Liu, Minqing Hu and Junsheng Cheng, University of Illinois at Chicago 


IT04| Industrial and Practical Experience Track (Paper 2) 


Location: Room 304 
Session Chair: Yuichi Nakamura, IBM Tokyo Research Laboratory 


The Infocious Web Search Engine: Improving Web Searching through Linguistic 
Analysis 
Alexandros Ntoulas, Gerald Chao and Junghoo Cho, Infocious Inc. 


How to make Web sites talk together - Web Service Gateway Solution 
Hoang Pham Huy*, Takahiro Kawamura† and Tetsuo Hasegawa† 
* Hanoi University of Technology 
† Toshiba R&D Center 


Diversified SCM Standard for the Japanese Retail Industry 
Koichi Hayashi, Naoki Koguro and Reki Murakami, UL Systems, Inc. 


Crawling a Country: Better Strategies than Breadth-First for Web Page Ordering 
Ricardo Baeza-Yates*, Carlos Castillo*, Mauricio Marin† and Andrea Rodriguez‡ 
* Universidad de Chile 
† Universidad de Magallanes 
‡ Universidad de Concepción 


PANEL04| Mobile Multimedia Services 

Location: Room 201 
Moderator: Behzad Shahraray, AT&T Labs - Research, USA 
Panelists: Wei-Ying Ma, Microsoft Research, USA 
Avideh Zakhor, University of California, Berkeley, USA 
Noboru Babaguchi, Osaka University, Japan 


This panel will mainly focus on the role that media processing can play in creating mobile 
communications, information, and entertainment services. A major premise of our discussion 
is that media processing techniques go beyond compression and can be employed to monitor, 
filter, convert, and repurpose information. Such automated techniques can serve to create 
personalized information and entertainment services in a cost-effective way, adapt existing 
content for consumption on mobile devices, and circumvent the inherent limitations of mobile 
devices. Some examples of the applications of media processing techniques for mobile service 
generation will be given. 


W3C04| Privacy and the Semantic Web 

Location: International Conference Room 
Session Chairs: Giles Hogben, Joint Research Center of the European Commission 
Thomas Roessler, W3C Technology and Society Team 
Speakers: Giles Hogben, Joint Research Center of the European Commission 
Thomas Roessler, W3C Technology and Society Team 


This session will explore privacy-enhancing technologies beyond P3P, how the semantic web is 
contributing to these technologies and vice-versa. 

We will discuss new technologies for enforcing privacy promises, evaluating the security of data 
processing operations in realtime and reasoning about anonymity of authorization credentials 
requested. The session will consist of presentations and a subsequent panel discussion. 


12:00 — 13:30 
Lunch 
Location: Exhibition Hall 8 


13:30 — 14:30 
Keynote Speech 
Location: Convention Hall 
Session Chair: Fred Douglis, IBM Research 
Towards Usable Web Privacy and Security 
Lorrie Cranor, Carnegie Mellon University 


Internet users now rely on a whole arsenal of tools to protect their security and privacy. Experts 
recommend that computer users install personal firewalls, anti-virus software, spyware blockers, 
spam filters, cookie managers, and a variety of other tools to keep themselves safe. Users are 
told to pick hard-to-guess passwords, use a different password at every Web site, and not to 
write any of their passwords down. They are told to read privacy policies before providing 
personal information to Web sites, look for lock icons before typing in a credit card number, 
refrain from opening email attachments from people they don’t know, and even to think twice 
about opening email attachments from people they do know. With so many do’s and don’ts, 
it is not surprising that much of this advice is ignored. In this talk I will highlight usability 
problems that make it difficult for people to protect their privacy and security on the Web, 
and I will discuss a number of approaches to addressing these problems. 


14:40 — 16:10 Parallel Sessions 


PT13| Web Engineering with Semantic Annotation 
Location: Room 301 
Session Chair: Piero Fraternali, Politecnico di Milano 


Accessibility: A Web Engineering Approach 

Peter Plessers*, Sven Casteleyn*, Yeliz Yesilada†, Olga De Troyer*, Robert Stevens†, Simon 
Harper† and Carole Goble† 
* Vrije Universiteit Brussel 
† School of Computer Science 

A Multilingual Usage Consultation Tool based on Internet Searching —More than 
search engine, Less than QA— 

Kumiko Tanaka-Ishii and Hiroshi Nakagawa, University of Tokyo 


Improving Portlet aggregation through deep annotation 
Oscar Díaz, Jon Iturrioz and Arantza Irastorza, Department of Computer Languages and Sys- 
tems, University of the Basque Country, Spain 
PT14| User-focused Search and Crawling 
Location: Room 302 
Session Chair: Brian Davison, Lehigh 
CubeSVD: A Novel Approach to Personalized Web Search 
Jian-Tao Sun*, Hua-Jun Zeng†, Huan Liu‡, Yu-Chang Lu§ and Zheng Chen† 
* Department of Computer Science, TsingHua University 
† Microsoft Research Asia 
‡ Department of Computer Science Engineering, Arizona State University 
§ Department of Computer Science, TsingHua University 
Automatic Identification of User Goals in Web Search 
Uichin Lee, Zhenyu Liu and Junghoo Cho, Computer Science Department, UCLA 


User-Centric Web Crawling 
Sandeep Pandey and Christopher Olston, CMU 


PT15| Trustworthy web sites 

Location: Room 303 

Session Chair: Oliver Spatscheck, AT&T Labs-Research 
An Abuse-Free Fair Contract Signing Protocol Based on the RSA Signature 
Guilin Wang, Institute for Infocomm Research, Singapore 


SGuard: Countering Vulnerabilities in Reputation Management for Decentralized 
Overlay Networks 
Mudhakar Srivatsa, Li Xiong and Ling Liu, College of Computing, Georgia Tech 


Static Approximation of Dynamically Generated Web Pages 
Yasuhiko Minamide, University of Tsukuba 


PANEL05 | On culture in a world-wide information society: Toward the 
knowledge society - the challenge 
Location: Room 201 
Moderator: Alfredo M. Ronchi, Politecnico di Milano, Milano Italy 
Panelists: Lynn Thiesmeyer, Keio University, Tokyo, Japan 
Antonella Quacchia, International Labor Office, Geneve, Swiss 
Georges Mihajes, Oslo Platform, Oslo, Norway 
Katsuhiro Onoda, Foundation for Computer & Communication Promotion, Japan 
Ranjit Makkuni, Sacred World Foundation - New Delhi - India 


Starting from more than ten years of experience and achievements in online cultural content, 
the panel aims to provide a comprehensive view on controversial issues, or unsolved problems, 
both in the WWW and Cultural community to stimulate lively, thoughtful, and sometimes 
provocative discussions. Panelists will outline the relevance of digital collections of intangible 
heritage and endangered archives and discuss the following topics: the “global” Web vs. the 
preservation of “local” cultural identities, cultural diversities and their relevance in delivering 
web based services, preservation & future of digital memories, Web-based development and 
sustainability models. We expect the panelists to actively engage the audience and help them 
broaden their understanding of the issues. 


URL: http://www.medicif.org/Events/MEDICI_events/WWW2005/default.htm 
W3C05| Recent Work in the Semantic Web Activity: Query and Best Practices 

Location: International Conference Room 
Session Chair: Guus Schreiber, W3C Semantic Web Best Practices and Deployment Working 
Group co-chair (Vrije Universiteit) 
Speakers: Guus Schreiber, W3C Semantic Web Best Practices and Deployment Working 
Group co-chair (Vrije Universiteit) 
Jeremy Carroll, HP 
David Wood, Mindswap, co-chair SWBPD 
Eric Prud'hommeaux, W3C 


This session presents recent work in the W3C Semantic Web activity, in particular the activities 
of the Semantic Web Best Practices & Deployment Working Group and the Data Access 
Working Group. 

The session will feature a number of talks addressing selected topics from this work, namely (i) 
links between RDF/OWL and UML, (ii) support for RDF/OWL-based ontology engineering, 
(iii) representing RDF metadata in XHTML, and (iv) an overview of the draft query language 
SPARQL. The session will also show some sample applications of semantic-web technology. 


14:40 — 17:00 


IT05| Industrial and Practical Experience Track Panel 
How Search Engines Shape the Web 
Location: Room 304 
Moderator: Byron Dom, Yahoo! 
Panelists: Krishna Bharat, Google 
Andrei Broder, IBM 
Jan Pedersen, Yahoo! 
Yoshinobu Tonomura, NTT 
Marc Najork, Microsoft 


16:10 — 16:30 
Break 


16:30 — 18:00 Parallel Sessions 


PT16| Semantic Search 
Location: Room 301 
Session Chair: Junghoo (john) Cho, University of California, Los Angeles 


A Search Engine for Large-Corpus Language Applications 
Michael J. Cafarella and Oren Etzioni, University of Washington 


An Enhanced Model for Searching in Semantic Portals 

Lei Zhang*, Yong Yu*, Jian Zhou*, ChenXi Lin* and David Y. Yang† 
* Shanghai JiaoTong University 
† Hong Kong University of Science and Technology 


Disambiguating Web Appearances of People in a Social Network 
Ron Bekkerman and Andrew McCallum, Department of Computer Science, University of 
Massachusetts 


PT17| Security Through the Eyes of Users 


Location: Room 302 
Session Chair: Mark Manasse, Microsoft Research 


A Convenient Method for Securely Managing Passwords 
J. Alex Halderman, Brent Waters and Edward W. Felten, Department of Computer Science, 
Princeton University 


Improving Understanding of Website Privacy Policies with Fine-Grained Policy 
Anchors 

Stephen E. Levy, Watson Research Center, IBM 

Carl Gutwin, Computer Science Department, University of Saskatchewan 


Hardening Web Browsers Against Man-in-the-Middle and Eavesdropping Attacks 
Haidong Xia and José Carlos Brustoloni, University of Pittsburgh 


PT18| Measurements and analysis 


Location: Room 303 
Session Chair: Prashant Shenoy, University of Massachusetts 


ATMEN: A Triggered Network Measurement Infrastructure 
Balachander Krishnamurthy*, Harsha V. Madhyastha† and Oliver Spatscheck* 
* AT&T Labs-Research 
† University of Washington 

On the lack of typical behavior in the global Web traffic network 
Mark Meiss, Filippo Menczer and Alessandro Vespignani, Indiana University 


Analysis of Multimedia Workloads with Implications for Internet Streaming 
Lei Guo*, Songqing Chen†, Zhen Xiao‡ and Xiaodong Zhang* 
* Department of Computer Science, College of William and Mary 
† Department of Computer Science, George Mason University 
‡ AT&T Labs-Research 


PANEL06 | Exploiting the dynamic networking effects of the web 

Location: Room 201 
Moderator: Ramesh Sarukkai, Yahoo, USA 
Panelists: Prof. Soumen Chakrabarti, Professor, IIT Bombay 
Dr. Gary William Flake, Head of Research Labs, Yahoo! 
Dr. Narayanan Shivakumar, Director of Ad Systems, Google 
Prof. Asim M. Ansari, Professor, Columbia Business School 


This panel aims to explore the dynamic networking effects of the Web. Today, linkages on the 
Web are augmented with dynamic connectivities based on various monetization strategies: e.g. 
ads and sponsored links. Such linkages change the dynamics of user click/flow on the Web. 
The key focus of this panel is to debate whether/how such dynamic effects on the Web can 
be modeled and best exploited. How can we derive cooperative placement strategies that are 
optimal from a customer perspective? As the World Wide Web becomes more dynamic with 
fluid link placements guided by different factors, optimizing link placement in a cooperative 
fashion across the Web will be an integral and crucial component. 


URL: http://research.yahoo.com/workshops/www2005/NetworkingEffectsWeb/ 


W3C06 | Web Internationalization Developments 

Location: International Conference Room 

Session Chair: Richard Ishida, W3C Internationalization Activity Lead 
Speaker: Richard Ishida, W3C Internationalization Activity Lead 


The W3C Internationalization Activity now comprises three Working Groups. This session 
brings you up to date with key areas of their work. 

Topics covered: recent clarifications by the GEO Working Group on language declaration and 
character encoding for Web documents; the issues before the newly formed Internationalized 
Tag Set Working Group; work by the Core Working Group on internationalized Web addresses; 
and related work on language tags. 


18:30 — 20:30 
Conference Dinner 
Location: Hotel New Otani, Tsuru 
Room  9:00 - 10:00
Hall  Keynote: Rob Glaser, RealNetworks

Room  10:30 - 12:00
301   PT19 | Service selection and Metadata
302   PT20 | Link-based Ranking
303   PT21 | Improving the Browsing Experience
201   PANEL07 | Querying the past, present and future: where we are and where we will be
ICR   W3C07 | The Future of XML

Room  13:30 - 15:00
301   PT22 | Semantic Web Foundations
302   PT23 | Link-based Similarity
303   PT24 | XML Parsing and Stylesheets
201   PANEL08 | Web engineering: technical discipline or social process
ICR   W3C08 | Interaction and the Web: The Future Browser

Room  15:30 - 17:00
301   PT25 | Schemas and Semantics
302   PT26 | Architecture and Implementation of Web sites
303   PT27 | Embedded Web
201   PANEL09 | Web services considered harmful?
ICR   W3C09 | Questions & Answers to the W3C Members and Team

Room  17:10 - 18:00
Hall  Closing Ceremony


9:00 — 10:00 
Keynote Speech 

Location: Convention Hall 

Session Chair: Fred Douglis, IBM Research 


Real and the Future of Digital Media 
Rob Glaser, RealNetworks, Inc. 


10:00 — 10:30 

Break 

10:30 — 12:00 Parallel Sessions 
PT19| Service selection and Metadata 


Location: Room 301 
Session Chair: Philipp Cimiano, Institute AIFB, University of Karlsruhe 


On Optimal Service Selection 
P.A. Bonatti and P. Festa, University of Napoli Federico II 


G-ToPSS: Fast Filtering of Graph-based Metadata 
Milenko Petrovic, Haifeng Liu and Hans-Arno Jacobsen, University of Toronto 


Automating Metadata Generation: the Simple Indexing Interface 
Kris Cardinaels, Michael Meire and Erik Duval, Departement Computerwetenschappen, Katholieke 
Universiteit Leuven 
PT20| Link-based Ranking 
Location: Room 302 
Session Chair: Torsten Suel, Polytechnic University, Brooklyn 


PageRank as a Function of the Damping Factor 

Paolo Boldi, Massimo Santini and Sebastiano Vigna, Università degli Studi di Milano 


Object-Level Ranking: Bringing Order to Web Objects 
Zaiqing Nie*, Yuanzhi Zhang†, Ji-Rong Wen* and Wei-Ying Ma* 
* Microsoft Research Asia, Beijing, China 
† Peking University, Beijing, China 

Unifying a Large Class of PageRank Optimizations 

Frank McSherry, Microsoft Research, SVC 


PT21| Improving the Browsing Experience 
Location: Room 303 
Session Chair: Yoelle Maarek, IBM Research Haifa 


Information Search and Re-access Strategies of Experienced Web Users 
Anne Aula, Natalie Jhaveri and Mika Kaki, University of Tampere 


Browsing Fatigue in Handhelds: Semantic Bookmarking Spells Relief 
Saikat Mukherjee and I.V. Ramakrishnan, Department of Computer Science, SUNY Stony 
Brook 


WebPod: Persistent Web Browsing Sessions with Pocketable Storage Devices 
Shaya Potter and Jason Nieh, Columbia University 


PANEL07 | Querying the past, present and future: where we are and where 
we will be 
Location: Room 201 
Moderator: Ling Liu, Georgia Institute of Technology, USA 
Panelists: Andrei Z Broder, IBM T.J. Watson Research Center, USA 
Dieter Fensel, Digital Enterprise Research Institute (DERI), Europe 
Carole Goble, University of Manchester, United Kingdom 
Christopher Olsen, MIT, USA 
Calton Pu, CERCS, Georgia Tech, USA 


This panel will focus on exploring future enhancements of Web technology for active Internet- 
scale information delivery and dissemination. It will ask the questions of whether the current 
Web technology is sufficient, what can be leveraged in this endeavor, and how a combination 
of ideas from a variety of existing disciplines can help in meeting the new challenges of large 
scale information dissemination. Relevant existing technologies and research areas include: 
active databases, agent systems, continual queries, event Web, publish/subscribe technology, 
sensor and stream data management. We expect that some suggestions may be in conflict with 
current, well-accepted approaches. 
W3C07 | The Future of XML 
Location: International Conference Room 
Session Chair: Liam Quin, W3C XML Activity Lead 
Speakers: Liam Quin, W3C XML Activity Lead 
Makoto Murata, IBM Tokyo Research Lab 
Robin Berjon, Expway 


What should the W3C XML Activity be working on over the next few years? How might we 
incorporate efficient transfer of XML (e.g. binary XML) into the XML stack? Where should 
our major specifications be going? Should we work on new specifications? 


This is a community session: come prepared to voice a considered opinion and be heard. 


12:00 — 13:30 
Lunch 
Location: Exhibition Hall 8 


13:30 — 15:00 Parallel Sessions 


PT22| Semantic Web Foundations 
Location: Room 301 
Session Chair: Steffen Staab, University of Koblenz-Landau 


Named Graphs, Provenance and Trust 

Jeremy J. Carroll, HP Labs 

Christian Bizer, Free University of Berlin 

Pat Hayes, IHMC 

Patrick Stickler, Nokia 

OWL DL vs. OWL Flight: Conceptual Modeling and Reasoning for the Semantic 
Web 

Jos de Bruijn, Axel Polleres, Ruben Lara and Dieter Fensel, Digital Enterprise Research 
Institute (DERI) 


Debugging OWL Ontologies 
Bijan Parsia, Evren Sirin and Aditya Kalyanpur, University of Maryland, College Park 


PT23| Link-based Similarity 
Location: Room 302 
Session Chair: Marc Najork, Microsoft 


Scaling Link-Based Similarity Search 

Daniel Fogaras, Budapest University of Technology and Economics 

Balazs Racz, Computer and Automation Research Institute of the Hungarian Academy of 
Sciences 

LSH Forest: Self-Tuning Indexes for Similarity Search 

Mayank Bawa*, Tyson Condie† and Prasanna Ganesan* 
* Stanford University 
† University of California, Berkeley 


Partitioning of Web Graphs by Community Topology 
Hidehiko Ino, Mineichi Kudo and Atsuyoshi Nakamura, Hokkaido University 


PT24| XML Parsing and Stylesheets 
Location: Room 303 
Session Chair: Makoto Murata, IBM Tokyo Research Lab 


Incremental Maintenance for Materialized XPath/XSLT Views 

Makoto Onizuka*, Fong Yee Chan†, Ryusuke Michigami‡ and Takashi Honishi* 
* NTT CyberSpace Laboratories 
† Simon Fraser University 
‡ Plala Networks Inc. 

Compiling XSLT 2.0 into XQuery 1.0 

Achille Fokoue, Kristoffer Rose, Jerome Simeon and Lionel Villard, IBM T.J. Watson Research 
Center 

An Adaptive, Fast, and Safe XML Parser Based on Byte Sequences Memorization 
Toshiro Takase, Hisashi Miyashita, Toyotaro Suzumura and Michiaki Tatsubori, IBM Tokyo 
Research Laboratory 


PANEL08 | Web engineering: technical discipline or social process 
Location: Room 201 
Moderator: Bebo White, Stanford Linear Accelerator Center, USA 
Panelists: David Lowe, University of Technology, Sydney 
Martin Gaedke, University of Karlsruhe 
Daniel Schwabe, PUC Rio de Janeiro 
Yogesh Deshpande, University of Western Sydney 


This panel aims to explore the nature of the emerging Web engineering discipline. It will 
attempt to strongly engage with the issue of whether Web Engineering is currently, and (more 
saliently) should be in the future, viewed primarily as a technical design discipline with its 
attention firmly on the way in which Web technologies can be leveraged in the design process, 
or whether it should be viewed primarily as a socio-positioned discipline which focuses on the 
nature of the way in which projects are managed, needs are understood and users interact. 


W3C08 | Interaction and the Web: The Future Browser 
Location: International Conference Room 
Session Chair: Steven Pemberton, W3C Interaction Domain Team 
Speakers: Steven Pemberton, W3C Interaction Domain Team 

Bert Bos, W3C 

TV Raman, IBM 

Mark Birbeck, x-port.net 

Dean Jackson, W3C 


As new W3C technologies begin to come online in browsers, new possibilities open for how 
browsers can be used and applied, across devices, and for new purposes. This session explores 
some of these new directions. 
Friday 


15:00 — 15:30 
Break 


15:30 — 17:00 Parallel Sessions 


PT25| Schemas and Semantics 
Location: Room 301 
Session Chair: Wolfgang Nejdl, L3S and University of Hannover 


CaTTS: Calendar Types and Constraints for Web Applications 
Francois Bry, Frank-André Rief and Stephanie Spranger, University of Munich 


Expressiveness of XSDs: from Practice to Theory, There and Back Again 
Geert Jan Bex*, Wim Martens*, Frank Neven* and Thomas Schwentick† 
* Limburgs Universitair Centrum 
† Philipps Universitaet Marburg 


WEESA - Web Engineering for Semantic Web Applications 
Gerald Reif*, Harald Gall† and Mehdi Jazayeri* 
* Distributed Systems Group, Vienna University of Technology 
† Department of Informatics, University of Zurich 


PT26| Architecture and Implementation of Web sites 
Location: Room 302 
Session Chair: Fred Douglis, IBM Research 


A Multi-Threaded PIPELINED Web Server Architecture for SMP/SoC Machines 
Gyu Sang Choi, Jin-Ha Kim, Deniz Ersoz and Chita R. Das, Pennsylvania State University 


Cataclysm: Policing Extreme Overloads in Internet Applications 
Bhuvan Urgaonkar and Prashant Shenoy, University of Massachusetts 


Design for Verification for Asynchronously Communicating Web Services 
Aysu Betin-Can*, Tevfik Bultan* and Xiang Fu† 
* University of California at Santa Barbara 
† Georgia Southwestern State University 


PT27 | Embedded Web 
Location: Room 303 
Session Chair: Tatsuya Hagino, Keio University 


Need for Non-Visual Feedback with Long Response Times in Mobile HCI 

Virpi Roto, Nokia Research Center 

Antti Oulasvirta, Helsinki Institute of Information Technology 

An environment for collaborative contents acquisition and editing by coordinated 
ubiquitous devices 

Yutaka Kidawara, NICT 

Tomoyuki Uchiyama, Kyoto University 

Katsumi Tanaka, NICT and Kyoto University 


PANEL09 | Web services considered harmful? 
Location: Room 201 
Moderator: Rohit Khare, CommerceNet Labs, USA 
Panelists: Jeff Barr, Amazon Web Services 

Mark Baker, Developer’s Day Chair 

Adam Bosworth, Google 

Tim Bray, Sun Microsystems 

Jeffery McManus, eBay Web Services 


It has been estimated that all of the Web Services specifications and proposals (“WS-*”) weigh 
in at several thousand pages by now. At the same time, their predecessor technologies such 
as XML-RPC have developed alongside other “grassroots” technologies like RSS. This debate 
has arguably even risen to the architectural level, contrasting “service-oriented architectures” 
with REST-based architectural styles. Unfortunately, the multiple overlapping specifications, 
standards bodies, and vendor strategies tend to obscure the very real successes of providing 
machine-automatable services over the Web today. This panel asks: Are current community 
processes for developing, debating, and adopting Web Services helping or hindering the 
adoption of Web Services technology? 


URL: http://labs.commerce.net/wiki/images/1/19/CN-TR-04-05.pdf 


W3C09 | Questions & Answers to the W3C Members and Team 
Location: International Conference Room 
Session Chair: Steve Bratt, W3C Chief Operating Officer 

Speaker: all W3C Track’05 session chairs and speakers 


17:10 — 18:00 
Closing Ceremony 
Location: Convention Hall 
Saturday — Developers’ Day Tutorials 


CODE | TITLE | PRESENTERS | LOCATION 
TD01 | Web Bloopers — Avoiding Common Web Design Mistakes | Jeff Johnson | 101 
TD02 | Current Best Practices in Web Development and Design (with WOW Web Professional Certification Exam Option) | | 303 


TD01 | Web Bloopers: Avoiding Common Web Design Mistakes 
Jeff Johnson, UI Wizards, Inc. 
Location: 101 


This tutorial is based on the presenter’s new book: Web Bloopers: 60 Common Web Design Mistakes 
and How to Avoid Them (Morgan Kaufmann, 2003). The book explains how to avoid common Web 
design errors, illustrated with examples from actual websites. The tutorial, like the book, organizes 
bloopers into categories: Content, Task-Support, Navigation, Form, Search, Text & Writing, Link 
Presentation, and Graphic and Layout. It includes class exercises in which participants review 
actual websites looking for bloopers and discuss how to improve them. The tutorial is intended 
for Web designers and developers, mainly those who lack several years of experience designing and 
evaluating websites and Web applications. Others who might benefit from this tutorial are web 
Q/A engineers, usability testers, and web development managers. After completing this full-day 
tutorial, participants will: 


• Have seen the most common Web design errors and ways to avoid them; 
• Be able to recognize those errors in websites and Web applications; 
• Be better designers and customers of websites and online services. 


TD02 | Current Best Practices in Web Development and Design 
(with WOW Web Professional Certification Exam Option) 
Location: 304 


The advancement of Web-based technologies is exciting, yet how do designers and developers 
working day to day adopt contemporary practices into existing workflow? This session’s goal is to focus 
on the hot topics in Web design and development as they pertain to the practical application of 
progressive web technologies. An emphasis on semantic markup, CSS, accessibility, as well as a 
concern for esthetic choices for screen, alternative devices, and even print will be discussed. 


Key Objectives: During these sessions you will: 
• Hone in on workflow problems in the industry, and learn about new models that can aid in 
improving your workflow. 
• Learn contemporary design & development technologies and how they can serve you. 
• Learn about Cascading Style Sheets (CSS) and why they are critical not just for designers, 
but for web developers and site managers, too. 
• Gain insight into current tools, including web browsers, and how to work with them more 
effectively. 
• Understand the benefits of Web standards and best practices, and learn how technical and ROI 
performance improves with their implementation. 


Participants will have the option to take one of two WOW certification exams at the end of the 
session, the Certified Professional Webmaster (CPW), or the Certified Professional Web Designer 
(CPWD) exam. For further information about WOW’s full day event, WOW’s certification exams, 
more about the WOW organization and additional certification resources and objectives visit: 

• Japanese version: http://www.webprofessionals.org/www2005jp 
• English version: http://www.webprofessionals.org/www2005 


Saturday — Developers’ Day 


9:00 — 10:00 
Keynote Speech 


Location: Room 201 


One project, four schema languages; medley or melee? 
Makoto Murata, International University of Japan 


10:30 — 12:00 Parallel Sessions 
DD01| Semantic Web 
Location: Room 201 
Session Chair: Jim Hendler, University of Maryland 


DD02 | Microformats 
Location: Room 301A 
Session Chairs: Tantek Celik, Technorati 
Eric Meyer, Complex Spiral Consulting 


DD03| Interaction and Visualization 
Location: Room 301B 
Session Chairs: Joyce Park, CommerceNet Labs 
Kevin Hughes, CommerceNet Labs 
Mark Baker, Coactus, Canada 


DD04 | Semantic Web Services in Practice 
Location: Room 302 
Session Chairs: Terry Payne, Southampton University 
Evren Sirin, University of Maryland 


13:30 — 15:00 Parallel Sessions 
DD05| Semantic Web 
Location: Room 201 


DD06 | Microformats 
Location: Room 301A 


DD07 | Interaction and Visualization 
Location: Room 301B 


DD08 | Semantic Web Services in Practice 
Location: Room 302 


15:30 — 16:15 Parallel Sessions 
DD09| Semantic Web 


Location: Room 201 
DD10 | Microformats 
Location: Room 301A 


15:30 — 17:00 


DD11 | Interaction and Visualization 
Location: Room 301B 


16:15 — 17:00 


DD12 | (Panel) Semantic Web and Microformats 


Location: Room 201 
Moderator: Mark Baker, Coactus, Canada 
Posters 
Location: Exhibition Hall 8 


[Floor plan of the poster area. Poster categories: 01 E-Commerce, 02 Multimedia, 03 Networking, 
04 Performance, 05 Search and Data Mining, 06 Semantic Web, 07 Applications and User Interface, 
08 XML and Web Services, 09 Web Engineering, 10 Security] 


E-Commerce 


0101 Designing Learning Services: From 
Content-based to Activity-based Learning 
Systems 
Pythagoras Karampiperis and Demetrios Sampson, 
Advanced e-Services for the Knowledge Society 
Research Unit, Informatics and Telematics Institute 


0102 How much is a Keyword worth? 
Ramesh Sarukkai, Yahoo! Inc. 


Multimedia 


0201 Accuracy Enhancement of Function oriented 
Classification of Web Images 
Koji Nakahira, Toshihiko Yamasaki and Kiyoharu 
Aizawa, Department of Frontier Informatics, The 
University of Tokyo 


0202 Multichannel publication of interactive media 
documents in a news environment 
Tom Beckers*, Nico Oorts†, Filip Hendrickx‡ and Rik 
Van de Walle* 
* Ghent University-IBBT 
† VRT 
‡ IMEC 


0203 Multi-Step Media Adaptation: 
Implementation of a Knowledge-Based Engine 
Peter Soetens and Matthias De Geyter, VRT 


0204 Personal TV Viewing by Using Live Chat as 
Metadata 
Hisashi Miyamori*, Satoshi Nakamura* and Katsumi 
Tanaka† 
* National Institute of Information and 
Communications Technology (NICT) 
† National Institute of Information and 
Communications Technology (NICT) and Kyoto 
University 


0205 Video Quality Estimation for Internet 
Streaming 
Amy Reibman, Subhabrata Sen and Jacobus van der 
Merwe, AT&T Labs-Research 


0206 Webified Video: Media Conversion from TV 
Program to Web Content and their Integrated 
Viewing Method 
Hisashi Miyamori, National Institute of Information 
and Communications Technology (NICT) 
Katsumi Tanaka, National Institute of Information 
and Communications Technology (NICT) and Kyoto 
University 


Networking 


0301 A Publish and Subscribe Collaboration 
Architecture for Web-Based Information 
M. Brian Blake*, David H. Fado† and Gregory A. 
Mack† 
* Department of Computer Science, Georgetown 
University 
† Advanced Systems and Concepts, Science 
Applications International Corp. (SAIC) 


0302 An Adaptive Middleware Infrastructure for 
Mobile Computing 
Ronnie Cheung, Department of Computing, Hong 
Kong Polytechnic University 


0303 An Approach for Realizing Privacy-Preserving 
Web-Based Services 
Wei Xu*, R. Sekar*, I.V. Ramakrishnan* and V.N. 
Venkatakrishnan† 
* Department of Computer Science, Stony Brook 
University 
† Department of Computer Science, University of 
Illinois at Chicago 


0304 Application Networking on Peer-to-Peer 
Networks 
Mu Su and Chi-Hung Chi, School of Computing, 
National University of Singapore 


0305 Automatic Generation of Web Portals Using 
Artificial Ants 
Hanene Azzag*, Gilles Venturini* and Christiane 
Guinot† 
* Laboratoire d’Informatique, Polytech Tours 
† CE.R.I.E.S 


0306 Data Versioning Techniques for Internet 
Transaction Management 
Ramkrishna Chatterjee and Gopalan Arun, Oracle 
Corporation 


0307 Design and Implementation of A Feedback 
Controller for Slowdown Differentiation on 
Internet Servers 
Jianbin Wei and Cheng-Zhong Xu, Department of 
Electrical and Computer Engineering, Wayne State 
University 


0308 Exploiting the Web for Point-in-Time File 
Sharing 
Roberto J. Bayardo Jr., IBM Almaden Research 
Center 
Sebastian Thomschke, IBM Deutschland GmbH 


0309 Finding Group Shilling in Recommendation 
System 
Xue-Feng Su*, Hua-Jun Zeng† and Zheng Chen† 
* Computer Science and Technology, Beijing 
University of Posts and Telecommunications 
† Microsoft Research Asia 


0310 Improved Timing Control for Web Server 
Systems Using Internal State Information 
Xue Liu, Rong Zheng, Jin Heo and Lui Sha, 
Department of Computer Science, University of 
Illinois at Urbana-Champaign 


0311 Information Flow using Edge Stress Factor 
Franco Salvetti, University of Colorado at Boulder 
Savitha Srinivasan, IBM Almaden Research Center 


0312 Predicting Navigation Patterns on the 
Mobile-Internet Using Time of the Week 
Martin Halvey, Mark T. Keane and Barry Smyth, 
Adaptive Information Cluster, Department of 
Computer Science, University College Dublin 


0313 WAND: A Meta-data Maintenance System 
over the Internet 
Anubhav Bhatia, Saikat Mukherjee, Saugat Mitra and 
Srinath Srinivasa, Indian Institute of Technology, 
Bangalore 


0314 Web Page Marker: a Web Browsing Support 
System based on Marking and Anchoring 
Takahiro Koga, Noriharu Tashiro, Tadachika Ozono, 
Takayuki Ito and Toramatsu Shintani, Department of 
Computer Science and Engineering, Nagoya Institute 
of Technology 

0315 Web Resource Geographic Location 
Classification and Detection 
Chuang Wang*, Xing Xie†, Lee Wang‡, Yansheng Lu* 
and Wei-Ying Ma† 
* Department of Computer Science, Huazhong 
University of Science and Technology 
† Microsoft Research Asia 
‡ Microsoft Corporation 


Performance 


0401 A Comprehensive Comparative Study on Term 
Weighting Schemes for Text Categorization 
with Support Vector Machines 
Man Lan*, Chew-Lim Tan†, Hwee-Boon Low* and 
Sam-Yuan Sung† 
* Institute for Infocomm Research 
† Department of Computer Science, National 
University of Singapore 


0402 A Framework for Determining Necessary Query 
Set Sizes to Evaluate Web Search Effectiveness 
Eric C. Jensen*, Steven M. Beitzel*, Ophir Frieder* 
and Abdur Chowdhury† 
* Information Retrieval Laboratory, Illinois Institute of 
Technology 
† Search & Navigation Group, America Online Inc. 


0403 Applying NavOptim to Minimise Navigational 
Effort 
David Lowe and Xiaoying Kong, University of 
Technology, Sydney 


0404 Boosting SVM Classifiers By Ensemble 
Yan-Shi Dong, Shanghai Jiao Tong University 
Ke-Song Han, Motorola Labs, China Research Center 


0405 Bootstrapping Ontology Alignment Methods 
with APFEL 
Marc Ehrig*, Steffen Staab† and York Sure* 
* Institute AIFB, University of Karlsruhe 
† ISWeb, University of Koblenz-Landau 


0406 Can Link Analysis Tell Us about Web Traffic? 
Marcin Sydow, Polish-Japanese Institute of 
Information Technology 


0407 Clustering for Probabilistic Model Estimation 
for CF 
Qing Li*, Byeong Man Kim† and Sung Hyon 
Myaeng* 
* Kumoh National Institute of Technology, Information & 
Communication University, Korea 
† Kumoh National Institute of Technology 


0408 Efficient Structural Joins with On-The-Fly 
Indexing 
Kun-Lung Wu, Shyh-Kwei Chen and Philip S. Yu, 
IBM T. J. Watson Research Center 


0409 Finding The Search Engine That Works For 
You 
Kin F. Li*, Wei Yu*, Shojiro Nishio† and Yali Wang* 
* Department of Electrical & Computer Engineering, 
University of Victoria 
† Graduate School of Information Science & 
Technology, Osaka University 


0410 Improving Text Collection Selection with 
Coverage and Overlap Statistics 
Thomas Hernandez and Subbarao Kambhampati, 
Department of Computer Science and Engineering, 
Arizona State University 


0411 Predicting Outcomes of Web Navigation 
Jacek Gwizdka and Ian Spence, Department of 
Psychology, University of Toronto 


0412 Site Abstraction for Rare Category 
Classification in Large-Scale Web Directory 
Tie-Yan Liu*, Hao Wan†, Tao Qin†, Zheng Chen*, 
Yong Ren† and Wei-Ying Ma* 
* Microsoft Research Asia 
† Department of Electronic Engineering, Tsinghua 
University 

0413 The WT10G dataset and the evolution of the 
Web 
Wei-Tsen Milly*, Markus Hagenbuchner* and Ah 
Chung Tsoi† 
* University of Wollongong 
† Australian Research Council 


0414 TotalRank: Ranking Without Damping 
Paolo Boldi, DSI, Università degli Studi di Milano 


0415 WCAG Formalization with W3C Standards 
Vicente Luque Centeno*, Carlos Delgado Kloos*, 
Martin Gaedke† and Martin Nussbaumer† 
* Carlos III University of Madrid 
† University of Karlsruhe 


Search and Data Mining 


0501 A Clustering Method for News Articles 
Retrieval System 
Hiroyuki Toda and Ryoji Kataoka, NTT Cyber 
Solutions Laboratories, NTT Corporation 


0502 A More Precise Model of Web Retrieval 
Junli Yuan, Institute for Infocomm Research, School of 
Computing, National University of Singapore 
Chi-Hung Chi, School of Computing, National 
University of Singapore 
Qibin Sun, Institute for Infocomm Research 
0503 Adaptive Page Ranking with Neural Networks 
Franco Scarselli*, Sweah Liang Yong†, Markus 
Hagenbuchner† and Ah Chung Tsoi‡ 
* University of Siena 
† University of Wollongong 
‡ Australian Research Council 


0504 Adaptive Query Routing in Peer Web Search 
Le-Shin Wu*, Ruj Akavipat* and Filippo Menczer† 
* Department of Computer Science, Indiana University 
† School of Informatics and Department of Computer 
Science, Indiana University 


0505 An Analysis of Search Engine Switching 
Behavior using Click Streams 
Yun-Fang Juan and Chi-Chao Chang, Yahoo! Inc. 


0506 An Economic Model of Web Search 
Georgios Kouroupas*, Elias Koutsoupias†, Christos 
Papadimitriou‡ and Martha Sideri* 
* Athens University of Economics and Business 
† University of Athens 
‡ University of California, Berkeley 


0507 An Information Extraction Engine for Web 
Discussion Forums 
Hanny Yulius Limanto, Nguyen Ngoc Giang, Vo Tan 
Trung, Nguyen Quang Huy, Jun Zhang and Qi He, 
Nanyang Technological University 


0508 Analysis of Topic Dynamics in Web Search 
Xuehua Shen*, Susan Dumais† and Eric Horvitz† 
* Department of Computer Science, University of 
Illinois 
† Microsoft Research 


0509 Analyzing Online Discussion for Marketing 
Intelligence 
Natalie Glance, Matthew Hurst, Kamal Nigam, 
Matthew Siegler, Robert Stockton and Takashi 
Tomokiyo, Intelliseek Applied Research Center 


0510 Analyzing Web Page Headings Considering 
Various Presentation 
Yushin Tatsumi and Toshiyuki Asahi, NEC Internet 
Systems Research Laboratories 


0511 Automatically Learning Document Taxonomies 
for Hierarchical Classification 
Kunal Punera, Suju Rajan and Joydeep Ghosh, 
Department of Electrical and Computer Engineering, 
University of Texas at Austin 


0512 BackRank: an Alternative for PageRank 
Mohamed Bouklit, LIRMM 
Fabien Mathieu, LIRMM-INRIA 


0513 Building an Open Source Meta-Search Engine 
Antonio Gulli, Dipartimento di Informatica, 
University of Pisa 
Alessio Signorini, University of Iowa, Computer 
Science 


0514 Comparing Relevance Feedback algorithms for 
Web Search 
Vishwa Vinay*, Ken Wood†, Natasa Milic-Frayling† 
and Ingemar Cox* 
* University College London 
† Microsoft Research Cambridge 


0515 Cyclone: An Encyclopedic Web Search Site 
Atsushi Fujii*, Katunobu Itou† and Tetsuya Ishikawa* 
* Graduate School of Library, Information and Media 
Studies, University of Tsukuba 
† Graduate School of Information Science, Nagoya 
University 


0516 Delivering new web content reusing remote and 
heterogeneous sites. A DOM-based approach 
Luis Alvarez Sabucedo and Luis Anido Rifón, 
Universidade de Vigo, Departamento Telemática 


0517 Exploiting the Deep Web with DynaBot: 
Matching, Probing, and Ranking 
Daniel Rocco*, James Caverlee†, Ling Liu† and 
Terence Critchlow‡ 
* University of West Georgia 
† Georgia Institute of Technology 
‡ Lawrence Livermore National Laboratory 


0518 Extracting Context To Improve Accuracy For 
HTML Content Extraction 
Suhit Gupta*, Gail Kaiser* and Salvatore Stolfo† 
* Programming Systems Lab, Columbia University 
† Intrusion Detection Lab, Columbia University 


0519 Focused Crawling By Exploiting Anchor Text 
Using Decision Tree 
Jun Li, Department of General System Studies, The 
University of Tokyo 
Kazutaka Furuse, Institute of Information Sciences & 
Electronics, University of Tsukuba 
Kazunori Yamaguchi, Information Technology Center, 
The University of Tokyo 


0520 From User-Centric Web Traffic Data to Usage 
Data 
Thomas Beauvisage and Houssem Assadi, France 
Telecom R&D 


0521 Incremental Page Rank Computation on 
Evolving Graphs 
Prasanna Desikan*, Nishith Pathak†, Jaideep 
Srivastava* and Vipin Kumar* 
* Department of Computer Science, University of 
Minnesota 
† Department of Computer Science, University of 
Minnesota & Indian Institute of Technology, Delhi 


0522 Information Retrieval in P2P Networks Using 
Genetic Algorithm 
Wan Yeung Wong, Tak Pang Lau and Irwin King, 
Department of Computer Science & Engineering, The 
Chinese University of Hong Kong 


0523 Learning How to Learn with Web Contents 
Akihiro Kashihara, The University of 
Electro-Communications 
Shinobu Hasegawa, Research Center for Distance 
Learning, Japan Advanced Institute of Science and 
Technology 


0524 METEOR: Metadata and Instance Extraction 
from Object Referral Lists on the Web 
Hasan Davulcu*, Srinivas Vadrevu*, Saravanakumar 
Nagarajan* and Fatih Gelgi† 
* Department of Computer Science and Engineering, 
Arizona State University 
† Department of Computer Science and Engineering, 
Arizona State University, 


0525 Mining Directed Social Network from Message 
Board 
Naohiro Matsumura*, David E. Goldberg† and Xavier 
Llorà† 
* Osaka University 
† University of Illinois at Urbana-Champaign 


0526 Mining Web Site’s Topic Hierarchy 
Nan Liu and Christopher C. Yang, Department of 
Systems Engineering and Engineering Management, 
The Chinese University of Hong Kong 


0527 Modeling the Author Bias Between Two 
On-line Computer Science Citation Databases 
Vaclav Petricek*, Ingemar J. Cox*, Hui Han†, Isaac 
Councill* and C. Lee Giles‡ 
* University College London 
† Yahoo! Inc. 
‡ The School of Information Sciences and Technology, 
The Pennsylvania State University 


0528 On the Feasibility of Low-rank Approximation 
for Personalized PageRank 
Andras Benczúr*, Karoly Csalogány* and Tamas 
Sarlós† 
* Eötvös University 
† Computer and Automation Institute, Hungarian 
Academy of Sciences 


0529 Predictive Ranking: A Novel Page Ranking 
Approach by Estimating the Web Structure 
Haixuan Yang*, Irwin King† and Michael R. Lyu† 
* Department of Computer Science and Engineering, 
The Chinese University of Hong Kong 
† Department of Computer Science and Engineering, 
The Chinese University of Hong Kong 


0530 Preferential Walk: Towards Efficient and 
Scalable Search in Unstructured Peer-to-Peer 
Networks 
Hai Zhuge, Xue Chen and Xiaoping Sun, Key Lab of 
Intelligent Information Processing, Institute of 
Computing Technology, Chinese Academy of Sciences 


0531 Representing Personal Web Information as a 
Topic-Oriented Interface 
Zhigang Hua*, Hao Liu†, Xing Xie*, Hanqing Lu* and 
Wei-Ying Ma‡ 
* Institute of Automation, Chinese Academy of 
Sciences 
† Department of Information Engineering, Chinese 
University of Hong Kong 
‡ Microsoft Research Asia 


0532 Retrieving Multimedia Web Objects Based on 
Page Rank Algorithm 
Christopher C. Yang and K. Y. Chan, Department of 
Systems Engineering and Engineering Management, 
The Chinese University of Hong Kong 


0533 SAT-MOD: Moderate Itemset Fittest for Text 
Classification 
Jianlin Feng, Huijun Liu and Jing Zou, Department of 
Computer Science, Huazhong University of Science & 
Technology 


0534 Semantic Search of Schema Repositories 
Tanveer Syeda-Mahmood*, Gauri Shah*, Lingling 
Yan† and Willi Urban‡ 
* IBM Almaden Research Center 
† IBM SVL 
‡ IBM Software Group 

0535 The Indexable Web is More than 11.5 Billion 
Pages 
Antonio Gulli, Università di Pisa, Dipartimento di 
Informatica 
Alessio Signorini, University of Iowa, Computer 
Science 


0536 TruRank: Taking PageRank to the Limit 
Sebastiano Vigna, Dipartimento di Scienze 
dell’Informazione, Università degli Studi di Milano 


0537 Using Visual Cues for Extraction of Tabular 
Data from Arbitrary HTML Documents 
Bernhard Krüpl, Marcus Herzog and Wolfgang 
Gatterbauer, Vienna University of Technology 


0538 Web Data Cleansing for Information Retrieval 
using Key Resource Page Selection 
Yiqun Liu, Canhui Wang, Min Zhang and Shaoping 
Ma, State Key Lab of Intelligent Technology & 
Systems, Tsinghua University 


0539 Web Log Mining with Adaptive Support 
Thresholds 
Jian-Chih Ou*, Chang-Hung Lee† and Ming-Syan 
Chen* 
* Department of Electrical Engineering, National 
Taiwan University 
† BenQ Corporation 


0540 XAR-Miner: Efficient Association Rules 
Mining for XML Data 
Sheng Zhang*, Ji Zhang†, Han Liu† and Wei Wang‡ 
* College of Physics Sciences and Technology, Nanjing 
Normal University, Nanjing, China 
† Department of Computer Science, University of 
Toronto 
‡ College of Educational Science, Nanjing Normal 
University, Nanjing, China 


Semantic Web 


0601 An Agent System for Ontology Sharing on 
WWW 
Kotaro Nakayama, Takahiro Hara and Shojiro Nishio, 
Graduate School of Information Science and 
Technology, Osaka University 


0602 An Approach for Ontology-based Elicitation of 
User Models to Enable Personalization on the 
Semantic Web 
Ronald Denaux*, Lora Aroyo* and Vania Dimitrova† 
* Department of Computer Science, Eindhoven 
University of Technology 
† School of Computing, University of Leeds 


0603 An Architecture for Personal Semantic Web 
Information Retrieval System 
Haibo Yu*, Tsunenori Mine† and Makoto Amamiya† 
* Graduate School of Information Science and 
Electrical Engineering, Kyushu University 
† Faculty of Information Science and Electrical 
Engineering, Kyushu University 


0604 Association Search in Semantic Web: Search + 
Inference 
Liang Bangyong, Tang Jie and Li Juanzi, Department 
of Computer Science, Tsinghua University 


0605 Automated Semantic Web Services 
Orchestration via Concept Covering 
Tommaso Di Noia*, Eugenio Di Sciascio*, Francesco 
M. Donini†, Azzurra Ragone* and Simona Colucci‡ 
* Politecnico di Bari 
† Università della Tuscia 
‡ Politecnico di Bari and Knowledge Management 
Institute - Open University 


0606 AVATAR: An approach based on Semantic 
Reasoning to Recommend Personalized TV 
programs 
Yolanda Blanco-Fernández, Jose J. Pazos-Arias, 
Alberto Gil-Solla, Manuel Ramos-Cabrer, Ana 
Fernández-Vilas, Rebeca P. Díaz-Redondo, Martín 
López-Nores and Belén Barragáns-Martínez, 
Department of Telematic Engineering, University of 
Vigo 


0607 Constructing Extensible XQuery Mappings 
Gang Qian and Yisheng Dong, Department of 
Computer Science and Engineering, Southeast 
University 


0608 Hera Presentation Generator 
Flavius Frasincar, Geert-Jan Houben and Peter Barna, 
Eindhoven University of Technology 


0609 Hybrid Semantic Tagging for Information 
Extraction 
Ronen Feldman*, Benjamin Rosenfeld*, Moshe 
Fresko* and Brian D. Davison† 
* Computer Science Department, Bar-Ilan University 
† Computer Science and Engineering, Lehigh 
University 


0610 Multiple Strategies Detection in Ontology 
Mapping 
Jie Tang, Bang-Yong Liang and Juan-Zi Li, 
Department of Computer, Tsinghua University 


0611 Semantic Virtual Environments 
Karsten A. Otto, Freie Universität Berlin 


0612 Service Discovery and Measurement based on 
DAML-QoS Ontology 
Chen Zhou, Liang-Tien Chia and Bu-Sung Lee, Center 
for Multimedia & Network Technology, Nanyang 
Technological University 


0613 Signing individual fragments of an RDF graph 
Giovanni Tummarello, Christian Morbidoni, Paolo 
Puliti and Francesco Piazza, Università Politecnica 
delle Marche 


0614 Soundness Proof of Z Semantics of OWL Using 
Institutions 
Dorel Lucanu*, Yuan Fang Li† and Jin Song Dong† 
* Faculty of Computer Science, “A.I.Cuza” University 
† School of Computing, National University of 
Singapore 


0615 Verify Feature Models using Protege-OWL 
Hai Wang, The University of Manchester 
Yuan Fang Li, National University of Singapore 
Jing Sun, The University of Auckland 
Hongyu Zhang, RMIT University 


Applications and User Interface 


0701 A Language for Expressing User-Context 
Preferences in the Web 
Juan Ignacio Vazquez and Diego Lopez de Ipina, 